How to Extract Data from Websites: Scraping Tools, DIY or DaaS

This is part 2 of our guide to web data extraction. Read part 1 to learn about the questions to ask before you start, or download the complete Web Data Extraction Playbook […]

Web Data Extraction Guide: 11 Questions to Ask

The following is an excerpt from our new Web Data Extraction Playbook. We’ll be publishing the second part next week, or you can grab the full guide here. The internet has become […]

5 Great Reasons to Meet Us at Strata

If you’re visiting this year’s Strata Data Conference in New York, you can find us at Booth #P17, and absolutely should. Here are 5 reasons why our (modest) booth is probably going […]

A Judge Just Ordered LinkedIn to Allow Scraping – Here's Why

When is it okay to grab data from someone else’s website without their explicit permission? A new ruling by a federal judge in California might have dramatic implications for this question, and […]

Web Data Visualization of The Hillary Clinton Top 100 Network Graph

The web data business can get pretty tricky, especially when your job is to extract the broadest possible dataset from the planet’s biggest database. Last week, Webz.io CEO Ran Geva ran a […]

Should you buy crawled web data or build your own solution?

In a technologically driven environment, the temptation to develop a proprietary web crawling solution is virtually irresistible. Our latest report examines the true cost of computing and software development resources required to deliver a data […]

The Race to Achieve 100% Coverage of the Web

In our new report, we deconstruct the all-too-familiar race to achieve 100% coverage of the web. Data acquisition efforts usually rely on one of three approaches – build an internal web crawling […]

How to Create a Custom RSS Feed for Content Monitoring

Imagine that you had the ability to track what’s being said, felt and published about a given topic, industry or brand. Whether you’re in marketing, sales, search engine optimization, management or just […]

The Top 10 Data & Analytics Articles of 2015

The online world of data and analytics is fast approaching epic proportions. It’s easy to get overwhelmed. Why? Because not only has big data been big business in 2015 … but posts, […]

To crawl or not to crawl, that is the question

To write an efficient crawler, you must be smart about the content you download. When your crawler downloads an HTML page, it uses bandwidth, memory and CPU, not only its […]

Vertical Aggregation and Pattern Matching Crawlers

After bashing various crawling techniques, I would like to describe the technique we use here at Webz.io, a technology developed over the past 8 years. Our crawlers were developed with […]