Feature List

Choose what to extract

Extract data from entire webpages or from specific portions of source code. Quickly send data to Batch Data Collector (BdC) with a right-click.

DOM and/or Text Recipe Creation

Define specific fields to extract by indicating the text and/or source code that contains the information you’re interested in. Or, for better performance, indicate the DOM object (or “node”), or the HTML element by class name (for multiple elements) or ID (for a single element).
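
As a rough illustration of the class-versus-ID distinction, here is a minimal sketch using Python's standard html.parser module. It is not BdC's own code; the names NodeTextCollector and extract are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NodeTextCollector(HTMLParser):
    """Collect the text of elements matching a class name or an ID."""

    def __init__(self, class_name=None, element_id=None):
        super().__init__()
        self.class_name = class_name
        self.element_id = element_id
        self.results = []
        self._depth = 0    # greater than 0 while inside a matching element
        self._buffer = []

    def _matches(self, attrs):
        attrs = dict(attrs)
        if self.element_id is not None and attrs.get("id") == self.element_id:
            return True
        classes = (attrs.get("class") or "").split()
        return self.class_name is not None and self.class_name in classes

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif self._matches(attrs):
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract(html, class_name=None, element_id=None):
    """Return the text of every element matching the class name or ID."""
    collector = NodeTextCollector(class_name, element_id)
    collector.feed(html)
    collector.close()
    return collector.results
```

A class name can match several elements, so extract(page, class_name="price") may return many values, while extract(page, element_id="title") is expected to return at most one on a well-formed page.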

Types of Data to Acquire

BdC is incredibly versatile, allowing you to search every HTML tag attribute (standard and non-standard), and even delve into the JavaScript code itself, then make text substitutions and/or filter key information. You can even blend data extraction methods between source code and HTML.

Premium only

Save Recipes, Safe Recipes

Premium subscribers can save an unlimited number of recipes within Batch Data Collector. The only information we save about you is your preferences. All transmissions are protected by SSL encryption.

Premium only

Auto-save during Batch Sourcing

BdC continuously makes automatic, temporary local backups of your extracted data using the IndexedDB transactional database, and allows you to recover them in the event of a browser crash. If BdC detects an interrupted navigation session, it will prompt you to review the backups upon restart.

Premium only

Navigation Pauses

Create custom pauses between opening pages, after a certain number of results have been extracted, or after a disconnection and/or a lack of usable results.

Premium only

Forced Timeout

Abandon pages that aren’t responding, so that a fussy page or bad URL doesn’t stall your navigation campaign.

Premium only

Reduce Noise

To speed up data capture, BdC allows you to filter images, disable cascading style sheets, block multimedia objects and JavaScript, or even skip storing results locally (for local export) if you only intend to send results directly to your external server.

Grab Links

Simple one-click extraction of all links on a webpage.
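
For readers who want to see what link extraction amounts to, here is a minimal sketch with Python's standard library. It is an illustration only, not how BdC works internally; grab_links and LinkCollector are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect unique absolute URLs from <a href="..."> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:
                self.links.append(absolute)


def grab_links(html, base_url):
    """Return every distinct link on the page, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links
```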

Custom Recipes

Recipes allow you to define what information you want to source from a webpage. You can add a column for each piece of data to extract. The name you give to each column (or label) will become the final column name in your file upon export. You can save your custom recipes (Premium only) and reuse them as you like.

Data Trimming

Each piece of extracted data can immediately be cleaned of unwanted white space or source code. This can greatly help reduce the time between “file extracted” and “file ready for use.”
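
The kind of cleanup described here can be sketched in a few lines of Python. This is an assumption about what "trimming" covers (stray tags, HTML entities, repeated white space), not BdC's actual routine; the function name trim is hypothetical.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # any HTML tag
WS_RE = re.compile(r"\s+")        # any run of white space


def trim(value):
    """Strip tags, decode entities, and collapse white space."""
    without_tags = TAG_RE.sub(" ", value)
    unescaped = html.unescape(without_tags)
    return WS_RE.sub(" ", unescaped).strip()
```

For example, trim("  <b>Tom &amp; Co</b>\n <br/> Ltd  ") yields "Tom & Co Ltd".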

Robust Recipe Functionality

BdC recipes contain customizable cycles of actions, or “loops”, allowing incredibly complex sequences of events to be performed before and after data extraction. You can define elements to click, collect links to pages not yet visited, move the viewport (scroll), activate and deactivate images and scripts (temporarily or entirely) for faster navigation or to trigger variable server behavior, and even drag and drop your elements to modify recipes on the fly.

Premium only

In Batch We Trust!

The Batch area of BdC allows you to begin combining your custom recipes, creating powerful multi-recipe programs in a single navigation campaign. You can also autosurf lengthy URL lists, specify a “Discard List,” and retrieve a final list of pages that yielded no results.

Premium only

Multi-tab Browsing

Choose how many tabs to open simultaneously, currently capped at 10. If you have a sizable navigation campaign, this could greatly reduce wait times and accelerate productivity, allowing future pages to open while the current pages are being extracted. We also support tab suppression in cases where a new tab is “lighter” and more efficient than reusing the same one.

Premium only

Throttle Your Processor

BdC lets you get the most out of your workstation by choosing how much processor effort it may use. A lighter work mode, for example, lets you keep working while BdC sources data in the background.

Premium only

User Agents & Other Configurations

While batching, BdC can rotate your User Agent settings after each page visit. It can also block iframe elements, object elements, and WebSocket connections.
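
Rotation after each page visit can be pictured as a simple round-robin. The sketch below assumes that behavior; rotate_user_agents and the placeholder agent strings are hypothetical, not BdC settings.

```python
from itertools import cycle


def rotate_user_agents(urls, user_agents):
    """Pair each URL with the next User Agent, cycling through the list."""
    if not user_agents:
        raise ValueError("at least one User Agent is required")
    agents = cycle(user_agents)
    return [(url, next(agents)) for url in urls]


# Placeholder values for illustration only.
plan = rotate_user_agents(
    ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
    ["UA-desktop", "UA-mobile"],
)
```

With two agents and three pages, the third page reuses the first agent.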

Grab Emails

Simple one-click extraction of all emails on a webpage.
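
A common way to pull email addresses out of page source is a regular expression. The sketch below uses a deliberately simple pattern that will miss some valid addresses; it is an illustration, not the pattern BdC uses, and grab_emails is a hypothetical name.

```python
import re

# Simplified pattern: local part, "@", domain, and a top-level domain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def grab_emails(text):
    """Return distinct email addresses, lowercased, in order of appearance."""
    seen = {}
    for match in EMAIL_RE.findall(text):
        seen.setdefault(match.lower(), None)
    return list(seen)
```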

Source & Redirect URLs

In addition to sourcing data within a webpage’s content, our recipes also allow you to extract things like the source URL (the #sourceURL# parameter), as well as one or more destination URLs resulting from hidden links and their underlying redirects. This is incredibly powerful, and something no one else is doing.

Data Exclusion

Define key data validation parameters, for example rejecting fields that contain undesirable information such as special characters, or accepting information only if it matches your specific criteria.
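
The two rule types mentioned above, rejecting on unwanted content and accepting only on a match, can be sketched with regular expressions. The helper make_validator and its parameters are hypothetical and only illustrate the idea.

```python
import re


def make_validator(reject=None, require=None):
    """Build a check from an optional reject pattern and an optional required pattern."""
    reject_re = re.compile(reject) if reject else None
    require_re = re.compile(require) if require else None

    def is_valid(value):
        # Reject as soon as any unwanted character sequence appears.
        if reject_re and reject_re.search(value):
            return False
        # Accept only if the whole value matches the required pattern.
        if require_re and not require_re.fullmatch(value):
            return False
        return True

    return is_valid


no_special = make_validator(reject=r"[^\w\s.,-]")
price_only = make_validator(require=r"\d+(\.\d{2})?")
```

Here no_special("Acme Ltd.") passes while no_special("Acme <b>") fails, and price_only accepts "12.50" but not "n/a".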

Premium only

Auto-detect Hidden Events

BdC can search the hidden events in a webpage’s underlying source code and recommend the most frequent buttons, both visible and invisible. This can help you either activate or suppress key events in your custom recipe.

Native Data Export

The data you collect will never be sent to our servers. We’ve included several coding technologies for XLSX and CSV files, some specifically modified to make data extractions more efficient. Prior to export, you can also review the information you’ve collected in Preview Mode: as HTML (table or code), as CSV, or as JSON.
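
To show how recipe column names could carry through to an exported file, here is a minimal sketch using Python's csv and json modules. It assumes each extracted row is a dictionary keyed by column name; to_csv and to_json are hypothetical helpers, not BdC's exporters.

```python
import csv
import io
import json


def to_csv(columns, rows):
    """Serialize rows to CSV text, with the recipe columns as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",   # drop keys that are not recipe columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)       # missing columns are written as empty strings
    return buffer.getvalue()


def to_json(columns, rows):
    """Serialize the same rows to a JSON array of objects."""
    records = [{col: row.get(col, "") for col in columns} for row in rows]
    return json.dumps(records, ensure_ascii=False)
```

Values containing commas are quoted automatically by the csv module, so a name like "Widget, large" stays in one column.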

Sounds and Alerts

BdC allows you to customize sounds for key events such as completing a list, scan rate, or general warnings.

Premium only

Send Data to an External Server

In addition to local data export, BdC can send real-time data to an external server in JSON format while executing a navigation campaign. Each recordset is defined by the column headings in your recipe. You can also specify additional parameters to be sent with each group of items (such as a token), detect each server response (successful or unsuccessful), and set the number of additional dispatch attempts to make.
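
The dispatch-and-retry idea can be sketched as follows. Everything specific here is an assumption made for illustration: the "records" key, the token parameter name, and the build_payload and dispatch helpers are hypothetical and do not describe BdC's actual wire format. The send callable stands in for whatever HTTP client posts the payload and reports success.

```python
import json


def build_payload(columns, rows, extra=None):
    """Build a JSON payload whose records are keyed by the recipe columns."""
    records = [{col: row.get(col, "") for col in columns} for row in rows]
    payload = {"records": records}   # hypothetical envelope key
    payload.update(extra or {})      # e.g. a token sent with each group
    return json.dumps(payload)


def dispatch(payload, send, max_retries=2):
    """Call send(payload) until it succeeds or the retries are used up.

    Returns (succeeded, attempts_made).
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        if send(payload):
            return True, attempts
    return False, attempts


# Simulated server that fails twice, then accepts the data.
responses = iter([False, False, True])
body = build_payload(["name"], [{"name": "Widget"}], extra={"token": "abc"})
result = dispatch(body, lambda _payload: next(responses), max_retries=2)
```

Passing the transport in as a function keeps the retry logic independent of any particular HTTP library.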

More to Come...

We’re not done developing BdC! Visit our published Roadmap to see what’s in store, and reach out to us if you have any questions or great ideas.

