The Internet is a rich source of information on almost any topic, but that information is often scattered, presented in an unsuitable form, or of questionable reliability. An educated visitor may want to analyze the data themselves, for example in spreadsheet software, but in most cases their only option is to manually copy and paste the individual values, which is lengthy, tedious, and unproductive work. This thesis deals with semi-automatic and automatic information extraction from web pages. The first part reviews existing approaches, covering both theoretical algorithms proposed in academic papers and commercially available software. The second part documents the development of a custom solution. The key feature of the resulting tool is a spreadsheet-like interface whose WYSIWYG approach contrasts with the imperative definition of extraction rules common in current solutions. The program learns in the background and, based on the values entered so far, suggests data for automatic extraction.