Should you use Portable or GCP Dataflow for ELT (Extract, Load, Transform)?
We've outlined the key considerations for comparing the two solutions below. For a more in-depth framework, check out our blog post on How to Choose the Right ELT tool.
Business applications vs. event-based data sources. Portable helps teams automatically sync data from their business applications and Software as a Service (SaaS) tools into their warehouse for analytics. Dataflow connects to event data from publish/subscribe (pub/sub) messaging systems, transforms the data, and loads the results into warehouses. Both workflows are critical to a holistic data strategy. If you need to load event-level data from your website, mobile app, or Internet of Things (IoT) devices into your warehouse, go with Dataflow. However, if you need to connect to business applications to build dashboards with automated Key Performance Indicators (KPIs), that is where Portable shines.
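To make the event-based workflow concrete, here is a minimal sketch of the read, transform, and load pattern described above. It is not Dataflow code: an in-memory queue stands in for a pub/sub topic, a plain list stands in for a warehouse table, and the event fields (`user_id`, `type`, `ts`) are assumptions made up for illustration.

```python
import json
from queue import Queue, Empty


def transform(raw: bytes) -> dict:
    """Decode one event message and keep only the fields the (hypothetical) table expects."""
    event = json.loads(raw.decode("utf-8"))
    return {
        "user_id": str(event["user_id"]),
        "event_type": event["type"].lower(),
        "ts": event["ts"],
    }


def run_pipeline(topic: Queue, table: list) -> int:
    """Drain the stand-in topic, transform each message, and append it to the stand-in table."""
    loaded = 0
    while True:
        try:
            raw = topic.get_nowait()
        except Empty:
            break  # no more messages waiting
        table.append(transform(raw))
        loaded += 1
    return loaded


# Publish one hypothetical website event, then run the pipeline once.
topic = Queue()
topic.put(
    json.dumps(
        {"user_id": 42, "type": "PAGE_VIEW", "ts": "2024-01-01T00:00:00Z", "debug": True}
    ).encode("utf-8")
)
warehouse_table = []
count = run_pipeline(topic, warehouse_table)
```

A real streaming pipeline runs continuously and has to handle retries, ordering, and malformed messages, which is part of why this workflow takes more setup than syncing a business application.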
Simple, no-code setup. Most analytics teams don't want to spend their time reading documentation, creating pub/sub topics, or navigating command-line tools. They just want an easy way to access business-critical information in their data warehouse. With Portable, syncing data from business applications is as simple as authenticating with your source, configuring your destination, and setting a workflow live. You can start querying data and creating value within minutes.
Ready-to-query schemas. With Dataflow, analytics teams have to create the datasets, tables, and models that will receive data before any pipeline can run. If you have spent enough time in analytics, you know how painful this can be. Portable, on the other hand, will connect to your warehouse, create or connect to a dataset, and deliver normalized, ready-to-query data without any work from you. Portable handles data types, field naming conventions, and schema changes, so you can focus on creating business value.
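For a sense of what handling data types, field naming conventions, and schema changes involves, here is a rough sketch of that kind of normalization. This is an illustration only, not Portable's implementation: the function names, the snake_case convention, and the generic type labels are all assumptions.

```python
import re


def to_snake_case(name: str) -> str:
    """Turn a source field name such as 'Account ID' or 'dealValue' into snake_case."""
    name = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.strip("_").lower()


def column_type(value) -> str:
    """Map a Python value to a generic, illustrative warehouse column type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    return "STRING"


def normalize(record: dict, schema: dict) -> dict:
    """Rename fields and register any column not yet in the schema (a schema change)."""
    row = {}
    for key, value in record.items():
        column = to_snake_case(key)
        if column not in schema and value is not None:
            schema[column] = column_type(value)
        row[column] = value
    return row


# Two hypothetical records from a SaaS source; the second introduces a new field.
schema = {}
first = normalize({"Account ID": 7, "dealValue": 1250.5, "Is Won": True}, schema)
second = normalize({"Account ID": 8, "Close Date": "2024-03-01"}, schema)
```

After the second record, `schema` has gained a `close_date` column without anyone editing a table definition by hand, which is the kind of work a managed connector takes off your plate.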
Cloud-agnostic strategy. Data teams need multi-cloud flexibility. You shouldn't be tied to using BigQuery for every project, forever. GCP Dataflow is built specifically to connect to managed GCP products (e.g., Pub/Sub and BigQuery). This can tie you to a particular ecosystem, instead of offering the flexibility and extensibility of the open data ecosystem. Portable is cloud agnostic, supports analytics environments across cloud providers, and offers a rapidly expanding catalog of connectors that sync data from third-party SaaS tools to your data warehouse of choice. If you need to run multiple analytics environments in parallel, or migrate from one to another, it's simple. We believe connectors should be decoupled from analytics environments.
Want to learn more? Book time for a discussion or a demo directly on my calendar.