ETL functions are best done by database programmers with expertise in development, rather than business analysts with no IT experience. If Citizen Integration is not the answer, what can companies do to get the most out of their ETL functions while building and maintaining a data warehouse that truly serves as an engine for revenue growth?
Instead of solutions that drive a wedge between IT and other departments, consider solutions that provide flexibility for data integration transform functions while also being robust enough to handle the most complex data table joins and query logic, using an interface that both the programmer and the business analyst will understand.
SQL has been the language of databases for decades. There are even SQL drivers for API-driven and NoSQL data sources. A product solution that lets the database do most of the heavy lifting when it comes to string, date and numeric transforms will be easy to learn and extensible. Because most IT departments are already so familiar with SQL, teaching them how to use a SQL-based user interface will be easier. And let us not forget that SQL was originally designed for ease of use by business analysts, so the end solution should be understood by the business analyst, even though it was written by a programmer.
Understanding Your Data
Many business analysts assume they have a good understanding of their company data, but often underestimate the range of values that may exist. That invariably means a lot more data mapping that they can realistically execute themselves.
On the other hand, data warehouse programmers with SQL skills can easily run the queries needed to determine what data is available, how the data is structured, and how to cleanse and transform it.
Business analyst and IT developers should each get to leverage their unique skill sets. The business analyst provides the spec because they’re the ones who know what they want to accomplish, while the database programmer figures out the best way to implement that spec and performance tune the queries.
ETL Made Simple
Modern business databases can easily contain tables featuring hundreds of fields that need to be continually accessed and managed. Adding to that complexity is the fact that many of those databases are stored and replicated both in the Cloud and on local servers.
Integration processes will often go sideways when environmental issues arise, such as the loss of an Internet connection or a database going down. When things do go wrong, companies need to quickly figure out what records have been updated and which have not.
With a SQL-based ETL tool, anyone familiar with native SQL can quickly address and solve these problems, and well as host of additional challenges that drag-and-drop ETL products make more complex. Obfuscating what’s really going on in the database with a diagrammatic representation makes diagnosis and performance tuning very much more complicated.
An ETL product must also handle high volumes of data while also being very transparent. That gives the programmer hands-on control of table joins so that if the joins take too long, for example, the reason for that slowdown can be identified and resolved quickly through the use of SQL hints or reordering joins.
A SQL-based ETL product can also prevent problems by wrapping all incoming and outgoing data with the proper functions so that the right transform can be done. The developer can quickly integrate input and output data by mapping source columns so that they correspond to the output table’s columns. And because it’s all written in SQL, the programmers can get instant validation that what they’re doing is working correctly. A read-only graphical interface to verify the data mappings can be very useful, but using the diagram to generate SQL will end up limiting the development effort by constraining it to the limitations of the user interface.
SQL is the native language of relational databases, and is being adapted to Big Data platforms as well with third party drivers. Assuming there is a driver to support your data platform and your coding language, using a common SQL interface will give you a single, familiar programmatic interface. Native SQL drivers for relational databases give you maximum flexibility, power, and performance. Integration products that allow you to code, tweak, or at least see the generated SQL will enable you to debug and tune your integrations.