Seznam.cz: The Czech Internet Giant Offering High-Quality Services
Seznam.cz is the leading Internet company in the Czech Republic. Its services include web search, specialized search, e-mail, news, entertainment, online maps, and the Sklik.cz advertising system. The company has more than 1000 employees, revenue of 2.8 billion CZK, and 2.4 million daily visitors. Its web search engine competes with Google, its maps are presented as more detailed than Google Maps, and its free e-mail is the most popular in the Czech market. The presentation also outlines the search engine architecture: query expansion, result aggregation, ranking, indexing, and downloading.
Presentation Transcript
Seznam.cz: The Czech number one Internet company
Jiří Materna, Head of Research
What is Seznam.cz
Internet portal with dozens of high-quality services:
- Web search (web search engine successfully competing with Google)
- Specialized search (Czech companies, e-shops)
- E-mail (the most popular free e-mail in the Czech market)
- News (covering business, politics, lifestyle, sport, weather, TV schedules, etc.)
- Entertainment (online video and music streaming)
- On-line maps (more detailed than Google Maps)
- Sklik.cz (advertising system)
- And others
@JiriMaterna www.seznam.cz
Seznam.cz in numbers
- More than 1000 employees
- Revenue: 2.8 billion CZK (108 million EUR)
- 2.4 million people visit Seznam.cz every day
- 1.5 billion crawled web pages:
  -- 45 % English
  -- 37 % Czech
  -- 7.7 % Slovak
  -- 2.3 % German
  -- 8 % Others
- 500 queries per second in peak hours
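As a quick worked check of the figures above, the language shares can be converted into approximate page counts. This is only a sketch: it assumes the percentages apply to the 1.5 billion crawled pages, and the names `CRAWLED_PAGES`, `SHARES_PER_MILLE`, and `pages_per_language` are illustrative, not from the presentation.

```python
# Language shares from the slide, expressed in tenths of a percent
# so that the arithmetic stays in exact integers.
CRAWLED_PAGES = 1_500_000_000
SHARES_PER_MILLE = {
    "English": 450,
    "Czech": 370,
    "Slovak": 77,
    "German": 23,
    "Others": 80,
}


def pages_per_language(total, shares):
    """Return an approximate page count for each language share."""
    return {lang: total * share // 1000 for lang, share in shares.items()}


counts = pages_per_language(CRAWLED_PAGES, SHARES_PER_MILLE)
for lang, n in counts.items():
    print(f"{lang}: about {n / 1e6:.1f} million pages")
```

Under that assumption, the Czech part of the index would be roughly 555 million pages.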
Search engine architecture
Query Expander
- Query understanding
- Graph representation:
  - AND
  - OR
  - optional
  - other relations
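The slide gives no detail about how the query graph is built, so the following is a minimal sketch of one way a query could be represented with AND, OR, and optional nodes. The class names (`Term`, `And`, `Or`, `Opt`), the `matches` function, and the example expansion are all hypothetical, not Seznam.cz's actual implementation.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class Term:
    word: str


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Opt:
    child: "Node"


Node = Union[Term, And, Or, Opt]


def matches(node: Node, tokens: Set[str]) -> bool:
    """Return True if a document's token set satisfies the query graph."""
    if isinstance(node, Term):
        return node.word in tokens
    if isinstance(node, And):
        return all(matches(c, tokens) for c in node.children)
    if isinstance(node, Or):
        return any(matches(c, tokens) for c in node.children)
    if isinstance(node, Opt):
        # An optional node never blocks a match; a ranker could still reward it.
        return True
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical expansion of the query "cheap hotel prague".
query = And((
    Or((Term("hotel"), Term("hotels"))),
    Term("prague"),
    Opt(Term("cheap")),
))

print(matches(query, {"hotels", "prague", "booking"}))
print(matches(query, {"hotel", "brno"}))
```

In this sketch the first document matches because "hotels" satisfies the OR node and "cheap" is optional; the second fails because "prague" is required.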
Search aggregators
- Deduplication
- Document sub-results
- SERP restrictions
- Caching
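To illustrate how deduplication, SERP restrictions, and caching might fit together in an aggregator, here is a small sketch. It assumes that each hit carries a content fingerprint, that the restriction is a per-site limit, and that the cache is a plain in-memory dictionary; the `Hit` type and the functions `aggregate` and `cached_search` are hypothetical and not taken from the presentation.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    url: str
    site: str
    fingerprint: str  # hypothetical content hash used to spot duplicates
    score: float


def aggregate(shard_results: Iterable[List[Hit]],
              top_k: int = 10, max_per_site: int = 2) -> List[Hit]:
    """Merge hits from several shards, best score first."""
    seen_fingerprints = set()
    per_site: Dict[str, int] = {}
    merged: List[Hit] = []
    all_hits = sorted((h for hits in shard_results for h in hits),
                      key=lambda h: h.score, reverse=True)
    for hit in all_hits:
        if len(merged) >= top_k:
            break
        if hit.fingerprint in seen_fingerprints:
            continue  # deduplication: same content already on the page
        if per_site.get(hit.site, 0) >= max_per_site:
            continue  # SERP restriction: limit results per site
        seen_fingerprints.add(hit.fingerprint)
        per_site[hit.site] = per_site.get(hit.site, 0) + 1
        merged.append(hit)
    return merged


_cache: Dict[str, List[Hit]] = {}


def cached_search(query: str,
                  fetch_shards: Callable[[str], List[List[Hit]]]) -> List[Hit]:
    """Serve repeated queries from the cache instead of re-aggregating."""
    if query not in _cache:
        _cache[query] = aggregate(fetch_shards(query))
    return _cache[query]


shard1 = [Hit("a.cz/1", "a.cz", "f1", 0.9), Hit("a.cz/2", "a.cz", "f2", 0.8)]
shard2 = [Hit("b.cz/1", "b.cz", "f1", 0.85), Hit("a.cz/3", "a.cz", "f3", 0.7),
          Hit("c.cz/1", "c.cz", "f4", 0.6)]
page = aggregate([shard1, shard2])
print([h.url for h in page])
```

Here `b.cz/1` is dropped as a duplicate of `a.cz/1`, and `a.cz/3` is dropped by the per-site limit.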
Ranking
- RC-Rank
- Boosted regression oblivious trees
- Hundreds of features
- Our own quality measure
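An oblivious tree applies the same (feature, threshold) test to every node on a given level, so a leaf can be addressed by a bit pattern of the test outcomes. The sketch below shows only prediction with a boosted ensemble of such trees. The trees, thresholds, and leaf values are made up for illustration, and nothing here reflects the internals of RC-Rank.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ObliviousTree:
    # One (feature index, threshold) pair per level, shared by all nodes on it.
    splits: List[Tuple[int, float]]
    # 2 ** depth leaf values, indexed by the bit pattern of split outcomes.
    leaves: List[float]

    def predict(self, x: Sequence[float]) -> float:
        index = 0
        for feature, threshold in self.splits:
            bit = 1 if x[feature] > threshold else 0
            index = (index << 1) | bit
        return self.leaves[index]


def ensemble_predict(trees: List[ObliviousTree], x: Sequence[float],
                     learning_rate: float = 1.0) -> float:
    """Boosted prediction: the scaled sum of all tree outputs."""
    return learning_rate * sum(tree.predict(x) for tree in trees)


# Hypothetical two-tree ensemble over two features.
trees = [
    ObliviousTree(splits=[(0, 0.5), (1, 2.0)], leaves=[0.0, 0.1, 0.2, 0.9]),
    ObliviousTree(splits=[(1, 1.5)], leaves=[-0.1, 0.3]),
]

score = ensemble_predict(trees, [0.7, 3.0])
print(round(score, 3))
```

Because every level shares one test, evaluating a tree of depth d needs only d comparisons and one table lookup, which is one reason this tree shape is attractive when scoring many documents per query.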
Index & Indexer
- Indexing: complete, daily, fresh
- Data structures:
  - word barrel: stores the inverted index
  - document barrel: stores document features
  - title barrel: stores processed web page content and metadata
  - others: query site barrel, site barrel, link barrel, qds barrel, query url barrel
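The word barrel is described as storing the inverted index. A minimal sketch of that idea follows: each word maps to a sorted list of document IDs, and a multi-word query is answered by intersecting the lists. The function names and the toy documents are assumptions for illustration; the real barrel format is not described in the presentation.

```python
from collections import defaultdict
from typing import Dict, List


def build_word_barrel(docs: Dict[int, str]) -> Dict[str, List[int]]:
    """Map each word to the sorted list of document IDs that contain it."""
    barrel = defaultdict(list)
    for doc_id in sorted(docs):
        for word in set(docs[doc_id].lower().split()):
            barrel[word].append(doc_id)
    return dict(barrel)


def intersect(a: List[int], b: List[int]) -> List[int]:
    """Intersect two sorted posting lists with a two-pointer scan."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search(barrel: Dict[str, List[int]], query: str) -> List[int]:
    """Return IDs of documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = barrel.get(words[0], [])
    for word in words[1:]:
        result = intersect(result, barrel.get(word, []))
    return result


docs = {1: "Prague hotel map", 2: "Prague weather", 3: "hotel booking Prague"}
barrel = build_word_barrel(docs)
print(search(barrel, "prague hotel"))
```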
Downloader & document database
- Hadoop, Giraffe, Yarn
- 50 million documents every day
- 1.5 billion documents stored out of 50 billion known documents
- Duplicate detection
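The slide does not say how duplicates are detected. One common approach, shown here purely as an assumption, is to compare sets of word shingles with Jaccard similarity. The function names, the shingle size, and the threshold are illustrative choices.

```python
import re
from typing import Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles of a lower-cased, tokenized text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(t1: str, t2: str,
                      threshold: float = 0.8, k: int = 3) -> bool:
    """Treat two texts as duplicates if their shingle sets overlap enough."""
    return jaccard(shingles(t1, k), shingles(t2, k)) >= threshold


original = "Seznam.cz is the Czech number one Internet company"
copy = "seznam.cz  IS the Czech number one internet company!"
other = "Weather forecast for Prague and Brno tomorrow"

print(is_near_duplicate(original, copy))
print(is_near_duplicate(original, other))
```

At the scale quoted on the slide, pairwise comparison would be too slow, so a production system would more likely hash the shingles into compact fingerprints; that step is left out of this sketch.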
Possible models of cooperation
- Joint projects
- Providing our technology
- Sharing data (MetaCentrum)
Thank you for your attention.
Jiří Materna, Head of Research, jiri.materna@firma.seznam.cz