Choose what to extract
Extract entire webpages or specific portions of source code. Quickly send data to Batch Data Collector with a right-click.
DOM and/or Text Recipe Creation
Define specific fields to extract by indicating the text and/or source code that contains the information you’re interested in. Or, for better performance, indicate the DOM object, or “node”: the HTML element identified by class name (for multiple elements) or ID (for a single element).
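To illustrate the difference between targeting by class name (many elements) and by ID (one element), here is a minimal Python sketch using only the standard library. It is not how BdC works internally; the `NodeCollector` class and the `extract` helper are hypothetical names invented for this example.

```python
from html.parser import HTMLParser


class NodeCollector(HTMLParser):
    """Collect the text of elements matching a class name or an ID."""

    # Void elements never get a closing tag, so they must not affect depth.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, class_name=None, element_id=None):
        super().__init__()
        self.class_name = class_name
        self.element_id = element_id
        self.results = []   # text of each matched element
        self._depth = 0     # > 0 while inside a matched element
        self._buffer = []

    def _matches(self, attrs):
        attrs = dict(attrs)
        if self.element_id is not None:
            return attrs.get("id") == self.element_id
        classes = (attrs.get("class") or "").split()
        return self.class_name in classes

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self._depth > 0:
            self._depth += 1          # nested tag inside a match
        elif self._matches(attrs):
            self._depth = 1           # start of a new match
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.VOID or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0:          # the matched element just closed
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract(html, class_name=None, element_id=None):
    parser = NodeCollector(class_name, element_id)
    parser.feed(html)
    parser.close()
    return parser.results


page = """
<div id="title">Acme Catalog</div>
<ul>
  <li class="price">10.00</li>
  <li class="price featured">12.50</li>
</ul>
"""
print(extract(page, element_id="title"))   # ['Acme Catalog']
print(extract(page, class_name="price"))   # ['10.00', '12.50']
```

An ID lookup yields at most one value per page, while a class lookup yields one value per matching element, which is why class names suit repeated items such as rows in a listing.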
Types of Data to Acquire
Save Recipes, Safe Recipes
Premium subscribers can save an unlimited number of recipes within Batch Data Collector. The only information we save about you is your preferences. All transmissions are protected by SSL encryption.
Auto-save during Batch Sourcing
BdC continuously makes temporary local backups of your extracted data using the IndexedDB transactional database, and lets you recover them in the event of a browser crash. If BdC detects an interrupted navigation session, it will prompt you to review the backups upon restart.
Create custom pauses between opening pages, after a certain number of results are extracted, or after a disconnection event and/or absence of desirable results.
Intentionally suppress pages that aren’t responding, so a fussy page or bad URL doesn’t stall your navigation campaign.
Simple one-click extraction of all links on a webpage.
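As a rough idea of what link extraction involves, the following Python sketch collects every `href` from the anchor tags of a page and resolves relative links against the page URL. It only illustrates the general technique and is not BdC's own code; `LinkCollector` and `extract_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect unique, absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:
                self.links.append(absolute)


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<a href="/about">About</a> '
    '<a href="https://example.org/x">X</a> '
    '<a href="/about">Again</a> '
    '<a name="top">No href</a>'
)
print(extract_links(sample, "https://example.com/index.html"))
```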
Recipes allow you to define what information you want to source from a webpage. You can add a column for each piece of data to extract. The name you give to each column (or label) will become the final column name in your file upon export. You can save your custom recipes (Premium only) and reuse them as you like.
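The relationship between recipe labels and export columns can be pictured with a short Python sketch. The labels and row below are made up for illustration, and `export_csv` is a hypothetical helper, not part of BdC.

```python
import csv
import io


def export_csv(labels, rows):
    """Write rows to CSV text, using the recipe labels as the header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=labels, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


labels = ["Company", "Email"]   # one label per recipe column
rows = [{"Company": "Acme, Inc.", "Email": "info@acme.example"}]
print(export_csv(labels, rows))
```

Each label appears once in the header line, and every extracted row fills the columns in the same order.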
Each piece of extracted data can immediately be cleaned of unwanted white space or source code. This can greatly help reduce the time between “file extracted” and “file ready for use.”
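A cleaning step of this kind might look like the Python sketch below, which strips markup, decodes HTML entities, and collapses runs of white space. The `clean_field` function and its options are assumptions made for this example; BdC's actual cleaning rules may differ.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def clean_field(raw, strip_tags=True, collapse_whitespace=True):
    """Remove markup and surplus white space from one extracted value."""
    value = raw
    if strip_tags:
        value = TAG_RE.sub(" ", value)   # replace each tag with a space
        value = html.unescape(value)     # &amp; -> &, and so on
    if collapse_whitespace:
        value = WHITESPACE_RE.sub(" ", value).strip()
    return value


print(clean_field("  <span>Smith &amp;   Sons</span>\n"))   # Smith & Sons
```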
Robust Recipe Functionality
BdC recipes contain customizable cycles of actions, or “loops,” allowing incredibly complex sequences of events to be performed before and after data extraction. You can define elements to click, collect links to pages not yet visited, move the viewport (scroll), activate and deactivate images and scripts (temporarily or entirely) for faster navigation or to trigger variable server behavior, and even drag and drop your elements to modify recipes on the fly.
In Batch We Trust!
The Batch area of BdC allows you to begin combining your custom recipes, creating powerful multi-recipe programs in a single navigation campaign. You can also autosurf lengthy URL lists, specify a “Discard List,” and retrieve a final list of pages that yielded no results.
Choose the number of tabs to open simultaneously, currently capped at 10. If you have a sizable navigation campaign, this could greatly reduce wait times and accelerate productivity, allowing future pages to open while the current pages are extracting. We also support tab suppression in instances where new tabs are “lighter” and more efficient than simply reusing the same tab.
Throttle Your Processor
BdC allows you to get the most out of your workstation by indicating the level of processor effort you’d like to use. A lighter work mode, for example, will allow you to continue working while BdC continues to source data in the background.
User Agents & Other Configurations
While batching, BdC can rotate your User Agent settings after each page visited. It can also block iframe elements, object elements, and websocket connections.
Simple one-click extraction of all emails on a webpage.
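Email extraction typically comes down to pattern matching over the page text. The Python sketch below uses a deliberately simple regular expression that covers common addresses but not every form allowed by the email standards. `EMAIL_RE` and `extract_emails` are hypothetical names, and this is not BdC's own pattern.

```python
import re

# Simplified pattern: local part, "@", then a dotted domain name.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique email addresses in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()   # treat differently cased copies as duplicates
        if email not in seen:
            seen.append(email)
    return seen


sample_text = (
    'Contact <a href="mailto:Sales@Example.com">sales@example.com</a> '
    "or support@example.co.uk."
)
print(extract_emails(sample_text))
```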
Source & Redirect URLs
In addition to sourcing data within a webpage’s content, our recipes also allow you to extract things like the Source URL (#sourceURL# parameter) and to gather one or more destination URLs resulting from hidden links and their underlying redirects. This is incredibly powerful, and something no one else is doing.
Define key data validation parameters, for example rejecting fields that contain undesirable information such as special characters, or accepting information only if it matches your specific criteria.
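Both kinds of rule (reject on a pattern, accept only on a pattern) can be sketched in a few lines of Python. The `make_validator` helper and the two example rules are assumptions for illustration only and do not reflect BdC's configuration format.

```python
import re


def make_validator(reject=None, accept=None):
    """Build a check from an optional reject pattern and accept pattern."""
    reject_re = re.compile(reject) if reject else None
    accept_re = re.compile(accept) if accept else None

    def is_valid(value):
        # Reject the field if the unwanted pattern occurs anywhere in it.
        if reject_re and reject_re.search(value):
            return False
        # Accept the field only if the whole value matches the criteria.
        if accept_re and not accept_re.fullmatch(value):
            return False
        return True

    return is_valid


# Reject anything other than word characters, white space, and . , -
no_special = make_validator(reject=r"[^\w\s.,-]")
# Accept only values shaped like a price with two decimals.
is_price = make_validator(accept=r"\d+\.\d{2}")

print(no_special("Acme Ltd."), no_special("Acme <Ltd>"))   # True False
print(is_price("12.50"), is_price("N/A"))                  # True False
```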
Auto-detect Hidden Events
BdC can search the hidden events in a webpage’s underlying source code and recommend the most frequently occurring buttons, both visible and invisible. This can help you either activate or suppress key events in your custom recipe.
Native Data Export
The data you collect will never be sent to our servers. We’ve included several coding technologies for XLSX and CSV files, some specifically modified to make data extractions more efficient. Prior to export, you can also review the information you’ve collected in Preview Mode, in HTML (table or code), in CSV, and in JSON.
Sounds and Alerts
BdC allows you to customize sounds for key events such as list completion, scan rate, or general warnings.
Send Data to an External Server
In addition to local data export, BdC can send real-time data to an external server in JSON format while executing a navigation campaign. Each recordset is defined by the column headings in your recipe. You can also specify additional parameters to be sent with each group of items (such as a token), detect each server response (successful or unsuccessful), and set the number of additional dispatch attempts to make after a failure.
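The flow described here (records keyed by column headings, extra parameters such as a token, and repeated dispatch attempts) can be sketched in Python as follows. The payload layout with a `records` key, the function names, and the example URL are all assumptions for illustration. BdC's actual JSON structure is not specified on this page.

```python
import json
import urllib.request


def build_payload(columns, rows, extra=None):
    """Turn extracted rows into records keyed by the recipe's column headings."""
    records = [dict(zip(columns, row)) for row in rows]
    payload = {"records": records}   # "records" is an assumed key name
    payload.update(extra or {})      # additional parameters, e.g. a token
    return payload


def http_post(url, payload, timeout=10):
    """POST the payload as JSON; return True on a 2xx response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:   # URLError and HTTPError are subclasses of OSError
        return False


def dispatch(payload, send, extra_attempts=2):
    """Call send(payload) once, then up to extra_attempts more times on failure."""
    for attempt in range(1 + extra_attempts):
        if send(payload):
            return attempt + 1   # number of attempts used
    return None                  # every attempt failed


payload = build_payload(["name", "price"], [["Widget", "9.99"]], {"token": "abc"})
print(json.dumps(payload))
# To send for real (hypothetical endpoint):
# dispatch(payload, lambda p: http_post("https://example.com/collect", p))
```

Passing the sender as a callable keeps the retry logic separate from the network call, so the same `dispatch` loop can be exercised with a stand-in function before pointing it at a real server.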
More to come...
We’re not done developing BdC! Visit our published Roadmap to see what’s in store, and reach out to us if you have any questions or great ideas.