An Overview of the Webz.io Duo of Crawlers

We get it, content creators. You pour your heart and soul into crafting insightful articles about brands, investment opportunities, and more. But a nagging worry persists: is your content reaching the right people? How will they find it? What if it gets lost in the endless scroll of Google results?

We hear you loud and clear. Here’s the good news: not all bots are created equal. We’re introducing a game-changer – the webzio bot duo. These bots are designed to be your silent partners, working behind the scenes to bridge the gap between your content and your ideal audience.

What are the webzio and webzio-extended bots?

Forget the days of content creation without a clear direction. Here’s where the webzio bot duo comes in:

    • Webzio: Webzio serves the hundreds of internal search engines that drive traffic to your website. It crawls your data so that the traffic coming into your site is highly relevant. These internal search engines include software such as Sprinklr, Signal, Mention, Brandwatch, and Recorded Future.

    • Webzio-extended: This bot takes things a step further. It analyzes your content to see whether you’ve indicated that it’s forbidden for AI usage, letting big data applications know whether the data is allowed for AI purposes. This indicator will be clearly marked on the data itself and reflected in our Terms of Service.

How does the webzio bot access your site?

Just like Google connects users with relevant information, Webz.io’s bot duo helps software tools discover your valuable content. Our bots always respect robots.txt exclusions, and you can easily indicate which data is off-limits for specific uses. The following table provides a list of the bot user-agents for easy identification in your server logs and robots.txt configuration.

Webz.io Crawlers | User agent token | Purpose
Webzio | webzio (+https://webz.io/bot.html) | This User Agent is utilized by hundreds of search engines developed for social listening and intelligence platforms.
Webz Crawler for AI/ML Training Data | webzio-extended (+https://webz.io/bot.html) | This User Agent is dedicated to determining if the data collected is permissible for AI use cases.

Table: The Webz.io crawlers and their purposes.
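
To make the identification step concrete, here is a minimal Python sketch that counts requests from each Webz.io crawler in a server access log. It assumes the log uses the common "combined" format, where the user-agent is the last quoted field on each line. The function names and the sample log lines are made up for illustration and are not part of any Webz.io tooling.

```python
import re
from collections import Counter

# Assumption: the user-agent is the last double-quoted field on the line,
# as in the "combined" access log format. Adjust for other log layouts.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def classify_webz_crawler(user_agent):
    """Return the Webz.io crawler token found in a user-agent string, or None."""
    ua = user_agent.lower()
    # Check the longer token first, because "webzio" is a prefix of it.
    if "webzio-extended" in ua:
        return "webzio-extended"
    if "webzio" in ua:
        return "webzio"
    return None


def count_webz_requests(log_lines):
    """Count requests per Webz.io crawler token across access log lines."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_FIELD.search(line)
        if match is None:
            continue  # line does not end with a quoted field; skip it
        token = classify_webz_crawler(match.group(1))
        if token is not None:
            counts[token] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample_log = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /post HTTP/1.1" 200 512 "-" "webzio (+https://webz.io/bot.html)"',
    '203.0.113.6 - - [01/Jan/2025:10:00:05 +0000] "GET /post HTTP/1.1" 200 512 "-" "webzio-extended (+https://webz.io/bot.html)"',
    '198.51.100.7 - - [01/Jan/2025:10:00:09 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

print(count_webz_requests(sample_log))
```

A user-agent string can be set by any client, so a match here only shows what the request claimed to be.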

To access the complete technical documentation please click here

How to block crawler bots using Robots.txt

We understand that you may have concerns about other bots crawling your content for generic AI training. While both webzio bots are designed for ethical data collection, you still have the option to block any bots you don’t want indexing your content. 

WARNING: Disallowing the webzio bot will stop traffic to your site altogether from internal search engines leveraging Webz.io. If you wish to prevent only AI usage, block only webzio-extended.

Here’s a basic example of a robots.txt file that disallows the webzio bot duo from crawling your website.

# Disallow Webz altogether
User-agent: Webzio
Disallow: /

# Disallow AI usage
User-agent: Webzio-extended
Disallow: /
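
If your goal is to prevent only AI usage, as the warning above describes, the robots.txt file would contain just the webzio-extended group. The Python sketch below checks such a file with the standard library's urllib.robotparser before you deploy it. The is_allowed helper and the example.com URL are placeholders for illustration. The check reflects how Python's parser reads the rules, which may differ in detail from how a given crawler interprets them.

```python
from urllib.robotparser import RobotFileParser

# Rules that opt out of AI usage only. The webzio crawler is not listed,
# so no rule restricts it.
AI_OPT_OUT_ROBOTS_TXT = """\
# Disallow AI usage only
User-agent: webzio-extended
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, not a real page.
page = "https://example.com/articles/brand-analysis"

print("webzio allowed:", is_allowed(AI_OPT_OUT_ROBOTS_TXT, "webzio", page))
print("webzio-extended allowed:", is_allowed(AI_OPT_OUT_ROBOTS_TXT, "webzio-extended", page))
```

You can run the same helper against your own robots.txt contents to confirm that a change blocks only the crawler you intended.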

Targeted traffic, real results

By providing structured data to these platforms, webzio opens doors to a world of possibilities, giving you the ability to:

    • Target Brand Decision-Makers: Imagine your insightful brand analysis being surfaced in the dashboards of brand monitoring platforms used by marketing executives and PR specialists. Webzio helps ensure your content is seen by the people who can truly benefit from it, potentially sparking collaborations or brand partnerships.

    • Attract Savvy Investors: Wouldn’t it be great if potential investors who perfectly align with the opportunities you highlight could discover your work? Webzio can help! The data it gathers can be used by software tools that analyze investment trends, potentially placing your content directly in front of the right investors at the right time.

    • Become an Industry Authority: Webzio can also be a valuable tool for content creators. By analyzing data trends and user behavior gleaned from where your content is mentioned within these platforms, you can gain valuable insights into your audience’s interests. This allows you to refine your content strategy, establish yourself as an authority in your niche, and attract even more high-value readers.

This is just a sample of the types of software we work with at Webz.io. Many more will drive highly relevant traffic to your content when you allow the webzio bot to crawl your site.

Transparency is our policy

You have complete control over your content. You can easily block our bots if you choose. But consider the potential: a targeted influx of users with a genuine interest in the specific value your content offers. This can translate into higher engagement, stronger thought leadership, and potentially even new clients or collaborations.

Don’t block the flow of opportunity

Don’t let your content languish in the digital wilderness. Just like Google connects users with information, webzio and webzio-extended tirelessly work behind the scenes, driving qualified traffic specifically to your valuable content. This targeted approach bridges the gap between your content and your ideal audience within brand monitoring platforms, investment research tools, and other relevant software.

Choose who your content empowers. Choose responsible crawlers.

To read the complete webzio bot duo technical documentation please click here. 
