Data has emerged as one of the world’s greatest resources, powering everything from video recommendation engines and digital banking to the burgeoning AI revolution. But in a world where data is increasingly distributed across databases, data warehouses, data lakes, and other locations, combining all of it into a compatible format for use in real-time scenarios can be a huge amount of work.
For context, applications that do not require immediate real-time data access can simply batch data together at fixed intervals. This so-called “batch data processing” is useful, for example, for processing monthly sales data. However, in many cases a company needs access to data in real time, as it is created. This is critical, for example, for customer support software that relies on up-to-date information on all sales. Elsewhere, ride-hailing apps need to process all sorts of data points to connect riders with drivers, which is not something that can wait a few days. This kind of scenario requires so-called “stream data processing,” where data is collected and combined for real-time access, and this is much more complicated to configure.
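To make the distinction concrete, here is a minimal Python sketch that contrasts the two approaches. It is purely illustrative: the sales events, the `batch_totals` function, and the `StreamingTotals` class are hypothetical names invented for this example, and none of it reflects how Dozer or any specific product is implemented.

```python
from collections import defaultdict

# Hypothetical sales events as (region, amount) pairs; illustrative data only.
EVENTS = [("APAC", 120.0), ("EMEA", 80.0), ("APAC", 30.0)]


def batch_totals(events):
    """Batch style: wait until every event is collected, then aggregate once."""
    totals = defaultdict(float)
    for region, amount in events:
        totals[region] += amount
    return dict(totals)


class StreamingTotals:
    """Stream style: update a running aggregate as each event arrives."""

    def __init__(self):
        self._totals = defaultdict(float)

    def on_event(self, region, amount):
        # Called once per incoming event, so results are always current.
        self._totals[region] += amount

    def snapshot(self):
        # Return a copy so callers cannot mutate the internal state.
        return dict(self._totals)


if __name__ == "__main__":
    stream = StreamingTotals()
    for region, amount in EVENTS:
        stream.on_event(region, amount)
        print("running totals:", stream.snapshot())
    print("batch totals:  ", batch_totals(EVENTS))
```

Both approaches arrive at the same totals in the end. The difference is that the streaming version can answer a query after every event, while the batch version has nothing to report until the whole interval has been processed.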
And this is what Dozer seeks to do, by providing fast read-only APIs directly from any source via a plug-and-play data infrastructure backend.
Dozer is the handiwork of Vivek Gudapuri and Matteo Pelati, who established the Singapore-based company almost a year ago. The two are building a 10-person distributed team across Asia and Eastern Europe, and are preparing to expand beyond the current source-available incarnation (that is, not fully open source) into a fully monetizable product.
Dozer has tested its product with a handful of private design partners, but today it is coming out of stealth with developer access. The company also revealed that it has raised $3 million in seed funding from Sequoia Capital India, Google’s Gradient Ventures, Surge, and January Capital.
Dozer co-founders Matteo Pelati and Vivek Gudapuri. Image credit: Dozer
Fragmentation
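Several categories of tools already address parts of the real-time data problem: streaming databases and stream processors such as Apache Flink; data integration tools such as Airbyte and Fivetran; caching layers for temporary data storage, such as Redis; and instant APIs, powered by the likes of Hasura and Supabase, for transferring data between systems.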
Dozer works across all of these different categories, taking what it deems the best and removing the friction associated with building the infrastructure and plumbing that underpins real-time data apps.
Users connect Dozer to their existing data stacks, including databases, data warehouses, and data lakes. Dozer handles real-time data extraction, caching, and indexing, and presents the data via low-latency APIs. So while the likes of Airbyte and Fivetran help get data into data warehouses, Dozer focuses on the other side. It’s about “making this data accessible in the most efficient way,” Gudapuri explained to TechCrunch.
Gudapuri said Dozer takes an “opinionated approach,” addressing only a very specific set of issues and nothing more. For example, existing streaming databases solve many problems that go far beyond what Dozer offers, which is real-time data updates and APIs in a single product.
“We solve the right amount of issues in each of these categories to give developers a quick build experience and out-of-the-box performance,” said Gudapuri. “Developers (currently) have to integrate several tools to achieve the same thing.”
As an example, an existing streaming database probably has a query engine, data exploration, OLAP (online analytical processing), and so on. Dozer deliberately does not provide these capabilities, focusing instead on what Pelati calls “precomputed views” built using SQL, Python, and JavaScript, all of which can be accessed with low latency via gRPC and REST APIs.
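The general idea behind a precomputed view can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Dozer’s actual API or implementation: the `PrecomputedView` class, its `apply` and `get` methods, and the order events are all hypothetical. The point is that the aggregation work happens when data changes, so a read is just a key lookup.

```python
class PrecomputedView:
    """Hypothetical per-customer order summary, maintained as changes arrive."""

    def __init__(self):
        # customer_id -> {"orders": int, "total": float}
        self._rows = {}

    def apply(self, op, customer_id, amount):
        """Fold one change event ("insert" or "delete") into the view."""
        if op not in ("insert", "delete"):
            raise ValueError(f"unsupported operation: {op}")
        row = self._rows.setdefault(customer_id, {"orders": 0, "total": 0.0})
        sign = 1 if op == "insert" else -1
        row["orders"] += sign
        row["total"] += sign * amount

    def get(self, customer_id):
        """Read path: a dictionary lookup, with no aggregation at query time."""
        row = self._rows.get(customer_id)
        return dict(row) if row is not None else None


if __name__ == "__main__":
    view = PrecomputedView()
    view.apply("insert", "c1", 50.0)
    view.apply("insert", "c1", 25.0)
    view.apply("delete", "c1", 50.0)
    print(view.get("c1"))
```

In a real system the change events would come from a source database or event stream, and the lookup would sit behind an API endpoint. The sketch leaves both out to keep the focus on the write-time versus read-time trade-off.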
So, according to Pelati, Dozer can promise improvements in data query latency.
“With these design choices, Dozer provides the much better query latencies that customer-facing applications require,” said Pelati. “A single developer can spin up an entire data app in minutes, which would normally take months of work, and save money.”
The (not quite) open source factor
While Dozer is touted as an “open source” platform, its GitHub license reveals that it uses the Elastic License 2.0 (ELv2), the exact same license that enterprise search company Elastic adopted two years ago as part of its transition away from true open source. In fact, the Elastic License is not recognized as open source, because it prevents third parties from taking the software and offering it as a hosted or managed service.
More precisely, ELv2 can be called a “source available” license. This effectively means it offers many of the benefits of a more permissive open source license such as MIT: codebase transparency, the ability to extend Dozer’s functionality, and the freedom to tweak features and fix bugs. That alone should be enough to win the hearts and minds of companies of all sizes, unless they are AWS or another cloud giant looking to monetize Dozer directly.
However, the company has said it intends to switch to a dual-licensing model “soon,” with all of its core Dozer projects moving to an MIT license except for “one core module.” Additionally, the company emphasizes that all of its client libraries, for Python, React, and JavaScript, are already MIT licensed.
It’s worth noting that some companies have created their own internal tools to solve problems similar to those addressed by Dozer. Netflix, for instance, built Bulldozer years ago. Notably, one of the main creators behind Bulldozer, Ioannis Papapanagiotou, is now working as an advisor to Dozer.
Dozer is still in its early stages, but with $3 million in the bank from a number of prominent backers, the company is well funded to go commercial with a hosted service that includes add-on features. Gudapuri said it will be released in the next few months.
“The hosted service handles autoscaling, instant deployment, security, compliance, rate limiting, and a few additional features,” said Gudapuri.