5 Ways to Measure the Impact of Crawled Web Data on Your Business
The analysis you provide is only as good as the raw data you start with. Although data from the open...
Calling all (almost) Kimono Labs Developers to Migrate to Webz.io
Kimono Labs announced today that it has been acquired by Palantir. Unfortunately, Kimono Labs users will only have...
Extracting Data from Forums: 3 Sources to Discover What Your Market Really Thinks
Robert Collier, the great ad man of the early 20th century, once summarized the secret of all effective marketing as...
Article’s publication date extractor – an overview
A few days ago I released an open source Python module that provides you with a simple way to extract...
To crawl or not to crawl, that is the question
In order to write an efficient crawler, you must be smart about the content you download. When your crawler downloads...
Dead simple {for devs} Python crawler (script) for extracting structured data from any website into CSV
In my previous post I wrote about a very basic web crawler that can randomly scour the web and...
Tiny basic multi-threaded web crawler in Python
If you need a simple web crawler that will scour the web for a while to download random sites’ content – this code...
How we quadrupled the performance of Elasticsearch
Well, that’s a misleading title. We actually quadrupled the performance of our brand monitoring alert system that uses Elasticsearch’s Percolator,...
Building a Better Search Query
Many factors can affect streaming data relevancy. When the data you consume isn’t ordered by relevancy, but rather by the time...