Text Signals Relevance Improvement for Full Text Search

Jan Hnizdil
The classical task of ranking the SERP for a given user query is still very important. This thesis tries to find new text relevance signals. First, it reviews the major Learning To Rank algorithms, evaluation metrics and commonly used text signals known from the literature. Second, a newly designed system for testing and evaluating new signals is described. Finally, it suggests new signals and presents the experimental results in comparison with currently used signals.

Detection of Anomalies in User Behaviour

Tomáš Vyskočil
Competition on the Internet grows every day. Some business owners try to gain an advantage using illegal methods. For example, they try to deplete a competitor's advertising budget through excessive clicking on sponsored results, or they try to fool suggesters to gain a higher rank for their keywords. Such anomalous behaviour often leads to a worsened user experience and is hard to detect. This thesis introduces three unsupervised models for atypical user detection. Classification is based on behavioural patterns, clicks, queries, etc. The achieved results are presented.


Automatic Name and Snippet Generation of Web pages with Unknown Content

Jonáš Amrich
A growing number of web pages are short on text, rich in multimedia, highly interactive (JavaScript), or have content that cannot be downloaded for other reasons. These web pages may still be valuable for users, and search engines need to return them. This thesis explores the automatic generation of descriptive snippets for the SERP. New methods are proposed and their performance is evaluated. Exemplary snippets are then generated.


The Users Categorization Based on the Browsing History

Dušan Jenčík
This thesis is a contribution to the categorization of Internet users from click streams. Clustering and generative model estimation methods are studied. The parallel Latent Semantic Analysis has been selected as the best performing method for our set of click streams. The resulting topics are defined by distributions of websites, and every user is described by a distribution of topics. The topic semantics have been estimated using the categories of the DMOZ database.


Automatic Text Summarization

Šimon Hlaváč
This work presents the basic methods used in automatic text summarization and genetic algorithms. Furthermore, a system for automatic summarization based on graph structures and Markov chains was designed, implemented and tested. The study also discusses how to learn a proper setting of the importance weights of the individual summarization methods, using both a naive approach and genetic algorithms, which were also implemented and tested. The system also supports parallel processing and caching to speed up its operation.
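
The abstract does not say how the graph and the Markov chain are combined. As a minimal sketch, assuming a TextRank-like scheme (an assumption, not the thesis's method), sentences can be treated as graph nodes, word overlap as edge weights, and the stationary distribution of the resulting random walk as the sentence importance. The function name `rank_sentences`, the Jaccard similarity and the damping value are illustrative choices.

```python
def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by the stationary distribution of a damped random walk
    over a word-overlap graph (TextRank-like; an assumed scheme)."""
    words = [set(s.lower().split()) for s in sentences]
    n = len(sentences)

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    # Edge weight between two different sentences; no self-loops.
    sim = [[jaccard(words[i], words[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sim]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            inflow = 0.0
            for i in range(n):
                if row_sums[i] > 0:
                    inflow += scores[i] * sim[i][j] / row_sums[i]
                else:
                    # A sentence with no neighbours spreads its mass uniformly.
                    inflow += scores[i] / n
            new_scores.append((1 - damping) / n + damping * inflow)
        scores = new_scores
    return scores
```

A summary would then be assembled from the highest-scoring sentences, kept in their original order.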

Phishing Email Detection in Czech Language for

Vít Listík
Phishing emails are a growing problem for all internet companies. The main goal of this work is automatic phishing detection. The first part reviews the currently used methods. Then the phishing data set is described along with the training. The core part of the work introduces a new two-stage solution. To reduce the computational cost, the first stage, a fast filter, dramatically reduces the number of suspicious emails. The filtering is based on 30 traffic signals. The second stage uses a decision tree classifier based on 25 features. Finally, a discussion of the results concludes the work. Some of the ideas of the solution are already in use in a Seznam email product.


Automatic performance-model driven cloud application deployment

Kábrt Tomáš
The goal of this thesis is to analyze the current state of IaaS (Infrastructure as a Service) cloud technologies and to review the possibilities of automatic provisioning with Chef and deployment of a two-tier application which consists of phpMyAdmin and Percona Cluster. This application was benchmarked and a performance model was created. The ideal number of nodes for the application was calculated by PDQ based on the performance model. The ADT web application was created to offer an easy way to automatically deploy nodes with the required software in the cloud.

Monitoring of private cloud

Židek Matěj
The goal of this bachelor thesis is to create an enhanced monitoring system for the OpenNebula cloud platform that stores data for at least one year and presents them in graphs in a user-friendly environment. The final solution uses the one2influx daemon to gather monitoring data from OpenNebula's XML-RPC API and saves them to the InfluxDB database. The visualization is done using the Grafana tool and, after customization of one2influx, the solution is applicable to other cloud solutions as well, e.g. OpenStack. At the moment the solution is in a state of limited functionality. The problem is caused by a version of InfluxDB that has not yet been fully debugged. This work thus serves more as a demonstration of what can be achieved with InfluxDB in a short period of time.



Learning to Rank Algorithms

Marek Modrý
The need for rapid and accurate search results is growing with the amount of data on the Internet. Learning to Rank (LTR) algorithms are the basic algorithms for ranking the results. They are supervised machine learning methods. This thesis provides an exhaustive listing and analysis of current state-of-the-art LTR algorithms and describes the necessary background for this work. Performance measures and available datasets are described. After many experiments, LambdaMART was evaluated as potentially the best LTR algorithm, and our own implementation of the algorithm is introduced. This thesis provides a guide for researchers interested in this topic, but it also opens many new questions and issues.
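
The abstract does not name the performance measures it covers. As an illustration only, the sketch below computes NDCG, a measure widely used to evaluate LTR algorithms; choosing NDCG and the exponential gain `2^rel - 1` is an assumption, and the function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of graded relevance labels in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)
    ideal = ideal[:k] if k else ideal
    best = dcg(ideal)
    # A query with no relevant documents gets a score of 0 by convention here.
    return dcg(ranked) / best if best > 0 else 0.0
```

Each list holds the relevance labels of the returned documents in the order the ranker produced them, so a perfectly ordered list scores 1.0.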



Design of Probabilistic Models for Text Input Correction

Antonín Novák
This bachelor thesis introduces a new algorithm for a Search Query Spelling Correction System. It is based on the Learning to Rank approach and allows the use of a large number of various signals, leading to improved accuracy. Its performance will be tested against the conventional solution – the noisy channel model. The new system was developed on a Czech Internet search query set, but the feature vector structure and the algorithm can be easily adapted to any other human language when sufficient data is available. We will describe the algorithm’s details, the training set and other datasets that were used. In the end we will present the final results. In cooperation with The results of the thesis were used in the production runtime environment.
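
For orientation, the noisy channel baseline picks the candidate c that maximizes P(c) * P(typed | c). The sketch below is a deliberately simplified version under loud assumptions: candidates are limited to one edit, the Latin alphabet stands in for the Czech one, the error model is a single constant `p_error`, and the word counts in `COUNTS` are invented toy data. It is not the system described in the thesis.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: Czech diacritics omitted


def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts, p_error=0.05):
    """Return the candidate c maximizing P(c) * P(word | c)."""
    total = sum(counts.values())
    scores = {}
    # Assumed channel model: the typed word is kept with weight 1 - p_error,
    # and every known one-edit candidate gets the same weight p_error.
    if word in counts:
        scores[word] = counts[word] / total * (1 - p_error)
    for cand in edits1(word):
        if cand != word and cand in counts:
            scores[cand] = counts[cand] / total * p_error
    return max(scores, key=scores.get) if scores else word


# Hypothetical toy language model: word -> frequency in a query log.
COUNTS = Counter({"search": 50, "query": 30, "quarry": 1})
```

With these toy counts, `correct("serch", COUNTS)` returns `"search"`, and an unknown word with no known neighbours is returned unchanged.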


Malware detection

Ondřej Pluskal
In collaboration with AVG, one of the most renowned anti-malware companies, we have designed a classifier for the client part of a malware detection system. Special care had to be taken to minimize the false-positive rate. The other difficulty is that the algorithms must be very efficient, because the number of samples (data) is enormous. Thanks go to Vojtěch Franc, who consulted on this research. In cooperation with The results of the thesis are being implemented in the AVG runtime product.


Contact-less heart beat rate measurement on iPad

Jan Plešek


Effective Scaling in Private IaaS

Karol Danko
Scalability is one of the main reasons why cloud services are slowly becoming mainstream in private applications. While scaling has proven advantageous in public cloud services, a sufficiently effective and universal solution for scaling private clouds is not yet available. The goal of our project is to develop a platform-independent solution that allows easy and effective scaling of private IaaS.


Karma JavaScript TestRunner

Vojtěch Jína
Karma is a TDD tool for JavaScript developers. JavaScript, the de facto standard for web development, is a very dynamic language without static typing. There is no compiler that could catch mistakes like misspelling a variable name or calling a nonexistent method on an object; developers have to actually run the code to catch these issues. Therefore, testing is absolutely necessary for professional development. Karma is a test runner that helps web developers become more productive and effective by making automated testing simpler and faster. It has been successfully used on many projects, including at companies such as Google and YouTube. This thesis describes the design and implementation of Karma, and the reasoning behind them. Partly supported by, Mountain View.


Client for university information system (KOS) on the Android platform

Martin Šesták
This bachelor thesis aims to explain, describe and demonstrate by example the programming of mobile applications for the Android operating system. It explains the history, system architecture, development environment and components used. Furthermore, it focuses on the analysis of current solutions and on the application design and implementation, including the technology used to develop the application. Finally, the work describes the testing of the mobile application and how it was carried out.


Service for Information Extraction

Miloslav Pojman
The Internet represents a rich source of information on almost any topic, which is often scattered, in an unsuitable form, or of questionable reliability. An educated visitor may want to make their own analysis of the data, for example using spreadsheet software, but in most cases their only option is to manually copy and paste the individual values, which is long, tiring and unproductive work. This thesis deals with semi-automatic and automatic information extraction from web pages. In the first part it reviews existing possibilities – it studies both theoretical algorithms designed in academic papers and commercially available software. In the second part the development of a custom solution is documented. The key feature of the created tool is a spreadsheet-like interface, whose WYSIWYG approach contrasts with the imperative definition of extraction rules common in current solutions. The program learns in the background and, based on the entered values, suggests data for automatic extraction.


Distributed Database for Mobile Devices

Burian Vladimír
The goal of this thesis is to investigate the possibilities of replicating data between a columnar NoSQL database, used as central storage in an IaaS infrastructure, and an arbitrary number of Android mobile devices with relational databases. The proposed system is multi-tenant. The NoSQL DB serves as storage for an arbitrary number of client databases, and the proposed system allows their efficient replication and sharing among an arbitrary number of devices. It can be seen as a service synchronizing user data between different devices or as a service providing a synchronized database for collaborative work of multiple users. The thesis discusses the NoSQL DB Cassandra, which was used to develop the system, the characteristics of the system as a whole, and methods to efficiently synchronize database changes and meaningfully resolve collisions. Emphasis is placed on system performance and resistance to failures, especially those caused by an unreliable mobile device connection. Both the server and the client are implemented and their basic principles are described in detail.


Voting display

Adam Hořčica
This master's thesis discusses the implementation of a system called Boring-o-meter for estimating the happiness and responsiveness of an audience during a lecture. The implementation follows a theoretical discussion of communication architecture and technologies, sensors, actuators, and embedded systems with respect to the Internet of Things. The system is based on real-time voting through a web interface. One element of the system is a mechanical device, a gauge, which is situated in the lecture room and presents the current value to the audience. The system was tried out in practice. Conclusions are presented at the end of the thesis.



RESTful Helpdesk application (in Czech)

Martin Prokš
The thesis deals with the design and implementation of a helpdesk application. The theoretical part introduces the problems of designing a fat-client SaaS application. The practical part focuses on the implementation of the application using a combination of Google App Engine and Spring technologies on the server and the Google Web Toolkit framework on the client. In addition, the installation and configuration of the project tools are described. Finally, a complete set of testing methods is described and used to test all aspects of the helpdesk application.


Automatic Deployment to PaaS Cloud

Petr Michalička
This thesis analyzes and evaluates the potential of current PaaS (Platform as a Service) cloud solutions for the migration and automatic deployment of popular open source web applications, such as WordPress or Drupal, into a cloud environment. Using suitable PaaS clouds, a SaaS (Software as a Service) web application for automatic deployment, called UpCf, was developed. UpCf enables users without any programming skills to deploy several popular applications into a cloud environment. The thesis summarizes the main migration problems of a typical web application and suggests solutions to enable future integration with PaaS clouds and the developed SaaS application.

Rapid Application Development of RESTful Servers

Ondřej Šťastný
The main objective is to create a system that helps developers rapidly create web applications adhering to the principles of the Representational State Transfer architecture using only client-side code. In the thesis we identify Node.js and the JavaScript programming language as the perfect candidates for implementing such a system, given that JavaScript is ubiquitous on the current generation of internet-enabled devices. We define the SPD schema as a novel approach to defining the data model and server interface, leveraging existing protocols and minimizing the number of protocols used in order to simplify third-party development. We verify our findings by implementing a proof-of-concept SPD application server and two reference applications. Using the scoring mechanism established in our research, we find that the newly proposed system outperforms the current state of the art.


Gestures detection and NFC for Android OS

Tomáš Tunys
This thesis presents the design of a gesture recognition system for hand-held devices with Android OS that is based on discrete hidden Markov models. A gesture is considered here to be a hand movement with the device, not a hand-drawn shape on a touch screen or a movement captured by a camera. In addition to the hidden Markov model classifier, the thesis contains an analysis of the movement detection sensors supported by Android OS and a description of the preprocessing steps applied to data from the 3D accelerometer, namely automatic gesture data segmentation and removal from the data of the effects of how the device is held and how vigorously (fast) the gesture is made. The result of the thesis is an implementation of the proposed gesture recognition system in Java, as well as an evaluation of the implemented recognizer on a set of chosen test gestures in both user-dependent and user-independent cases.
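
A discrete HMM classifier of this kind is usually built from one model per gesture, with the winning gesture being the model that gives the observed symbol sequence the highest likelihood. The sketch below shows that idea with the forward algorithm. It is written in Python rather than the thesis's Java, and the two-state models and the symbol alphabet {0, 1} in `MODELS` are invented toy values, not data from the thesis.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed by the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Name of the model that assigns the highest likelihood to obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical left-to-right models over quantized accelerometer symbols 0/1.
MODELS = {
    "left": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "right": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
```

For long sequences the raw probabilities underflow, so a practical recognizer would work with scaled or log-domain values; that step is left out here for brevity.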


Mobile sales support applications

Vratislav Zima
The thesis focuses on the following problems: finding an appropriate technology for the web server and an Android client, designing an API between the client and the server for a RESTful architecture, selecting and configuring an appropriate database for the client and the server, and finding an appropriate connection to the Facebook API. Using the selected technologies, an Android client-server application is implemented that allows users to authenticate to the application, take photos, read and view QR codes, and store data on the server and on Facebook. The server supports basic user and data administration. In addition, tests of the completed application as well as user tests are described.


Interfacing an Android native app to PHP web server (in Czech)

Martin Falta
This thesis designs an Android application for an existing portal. It shows step by step how to connect a mobile client to an existing PHP-based server. First, a wrapper for the server part is discussed, designed and implemented. The wrapper converts the PHP-based server to a REST architecture. Next, a new REST API for the client-server communication is designed and implemented. The last part describes the usual steps for designing, developing, implementing and testing a standard Android application. The application allows users to interact with the portal, which contains children's pictures, and implements the standard CRUD operations.


REST API testing (Testování REST API)

Ludvík Haltuf
This thesis analyses the REST architectural style and designs an application for testing REST API compliance. First, the problem of the RESTful architecture is formulated. The next part suggests practices and implementation steps leading to a RESTful application design. The currently available testing tools, both for APIs in general and for REST APIs specifically, are compared and discussed. Finally, the design and implementation of a new REST API testing application is described.


Time tickets (in Czech)

Tomáš Lucovič
The thesis addresses the problem of time tracking on software projects. An add-on application for Pivotal Tracker is developed. It implements a user-friendly and simple UI for tracking the time spent on tasks, with an emphasis on maximum accuracy and minimal demands on users. In addition, we plan to extend the functionality to smartphones and tablets.


Automated Trading System Based on Statistical Arbitrage Strategy

Michal Pavlásek
The goal of this project is an algorithmic analysis of historical financial data, identification of viable pairs and detection of trading impulses, resulting in highly automated trading software. The paper describes the problem, analyses the requirements and proposes a design for such software. Implementation and project management are described at the end.
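
The abstract does not state how trading impulses are detected. One common approach in pairs trading, assumed here purely for illustration, is to watch the z-score of the price spread between the two instruments and act when it leaves a band. The function name, the 1:1 hedge ratio and the `entry` threshold of 2.0 are all hypothetical choices.

```python
from statistics import mean, stdev


def zscore_signal(prices_a, prices_b, entry=2.0):
    """Return 'short_spread', 'long_spread' or 'hold' for the latest prices.

    The spread is taken as A - B (assumed hedge ratio of 1). The latest spread
    is compared with the mean and sample standard deviation of the earlier ones.
    """
    spread = [a - b for a, b in zip(prices_a, prices_b)]
    history, latest = spread[:-1], spread[-1]
    sigma = stdev(history)  # needs at least two historical points
    if sigma == 0:
        return "hold"
    z = (latest - mean(history)) / sigma
    if z > entry:
        return "short_spread"  # A looks expensive relative to B
    if z < -entry:
        return "long_spread"   # A looks cheap relative to B
    return "hold"
```

A real system would also need to test that the pair is actually mean-reverting (for example by a cointegration test) before trusting such a signal.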



Martin Petrus
Helpdesk is an application designed for communicating with customers and replying to their questions and needs. The application can be used on a variety of different projects. Its benefit in comparison with a regular call center is the option to attach photos and location, which can save the call center time or enable it to solve problems that couldn’t be solved in a classic telephone dialog. An operator takes the client's ticket and solves the problem, then replies on the ticket with an answer or with a request for more detailed information. The operator can use either the mobile phone client application or the web application. The client application is being developed for the Android platform; the server part is built on Google App Engine.



Jan Kolařík
czSMS used to be the first full-featured Android application for sending free text messages within the Czech Republic. The author abandoned the project more than a year ago for lack of time, but has now decided to revive it and make it available again, this time for a variety of mobile platforms (Android, Windows Phone, Samsung Bada, Symbian, webOS). It will allow users to send free text messages (sent using mobile data transfers or Wi-Fi) to all phone numbers within the Czech Republic, manage message history, and display texting statistics.


Geographic Proximity Detector

Tomáš Gogár
Geographic proximity detector (GPD) is a tool which fires an event when the device visits a particular geographic position. In this project we are developing a GPD for smartphones in order to reach the highest possible accuracy and reliability without the need for any additional device at the monitored position. Although we started to develop this tool for one particular mobile application, it can be used in many other applications as well.
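
The core behaviour, firing an event on arrival at a position, can be sketched as a distance check against a circular region. The version below is a minimal illustration, assuming position fixes arrive as latitude/longitude pairs and that great-circle (haversine) distance is accurate enough; the class name, the callback and the radius are hypothetical, and the accuracy and reliability work the project targets is not modelled.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


class ProximityDetector:
    """Calls on_enter once each time the device moves into the target circle."""

    def __init__(self, lat, lon, radius_m, on_enter):
        self.lat, self.lon = lat, lon
        self.radius_m = radius_m
        self.on_enter = on_enter
        self.inside = False

    def update(self, lat, lon):
        """Feed a new position fix; fire the event on an outside-to-inside change."""
        now_inside = haversine_m(self.lat, self.lon, lat, lon) <= self.radius_m
        if now_inside and not self.inside:
            self.on_enter()
        self.inside = now_inside
```

Tracking the previous state means the event fires on entry only, not on every fix received while the device stays inside the circle.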