Blogs

The Knowledge Dividend of LLMs: a pragmatic perspective 

Learn what LLMs like GPT-4 know, how they know it and what this means for users with this pragmatic look at knowledge in large language models.

How a Fortune 10 Company Builds Great Tableau Dashboards Faster Than You

Learn how leading companies leverage design systems to help users easily create consistently high-quality Tableau dashboards that improve decision-making.

A No-Nonsense Approach to Large Language Models for the Enterprise pt. 3

See the results of experiments with GPT and other LLMs in enterprise use cases to set realistic expectations and get the most from the technology.

Avoid the Pitfalls of Causality Analysis

Learn the basics of the most popular root cause analysis methods and find out how to apply them to uncover KPI drivers and improve decision-making.

A No-Nonsense Approach to Large Language Models for the Enterprise pt. 2

Learn how OpenAI and its open-source competitors fare in terms of performance, price and security in an enterprise context—based on real-world experiments.

Profitable Location Data Monetization — 3 Lessons from a Telco Company

Get best practices for turning your organizational data into a lucrative revenue stream, based on actual project experience at a major telco company.

A No-Nonsense Approach to Large Language Models for the Enterprise pt. 1

Ignore the hype around large language models like ChatGPT and find out from data scientists where the real opportunities lie for the enterprise.

Overcoming Data Science Challenges in Biosensor Analytics

AI and biosensor analytics will change healthcare as we know it, if companies can deliver the right solutions. See how this is playing out.

Avoid These 3 Mistakes When Working with a Synapse Dedicated SQL Pool

Learn three often overlooked techniques for optimizing resource utilization and ensuring cost-effectiveness with a Synapse dedicated SQL pool.

Consumer Goods R&D with Automated Product Stability Forecasting

See how a Fortune 50 company used an ML-driven solution to digitalize testing and improve the speed and cost-effectiveness of research and development.

Find the Balance between Cloud Cost and Efficiency

Learn how to measure the ROI of a cloud migration and get clarity on the opportunities and challenges inherent in adopting cloud-native solutions.

From Guesswork to Genius: How to Get Maximum Value from Marketing Data and Automation

See how one company used marketing data to learn more about their audience and create more effective campaigns, and learn how you can do the same.

Data Clean Rooms Demystified: A New Era in Secure Data Analysis

Get up to speed on data clean rooms and find out how they help businesses promote innovation and collaboration while maintaining security and compliance.

My New Favorite Extensions for VSCode: CodeGPT and Github Actions

Make your life as a data engineer easier and more productive with two new VSCode extensions that grant you easy access to ChatGPT and GitHub.

Open Table Formats for Efficient Data Processing: Delta Lake vs Iceberg vs Hudi

Compare three popular open table formats to see which one will give you better performance based on your needs for data storage, processing and more.

Asemantic Induction of Hallucinations in Large Language Models

See how you can get GPT-4 to hallucinate and what it tells us about how GPT and similar language models arrive at their outputs.

Can We Put a Lie Detector on ChatGPT?

Find out how a ChatGPT-like generative AI solution could be equipped to make it a reliable source of information for business use cases and beyond.

Data for the Next Pandemic

Data has made all the difference in this pandemic. Learn how we need to prepare for the next one using modern methods of infectious disease modeling.

Data Fabric vs Data Mesh: Find the Right Fit for Your Organization

Learn the differences between data mesh and data fabric architectures and find out which one's right for your data governance and analytics needs.

MQTT on Steroids: Running EMQX Enterprise on Kubernetes

Learn how to deploy EMQX, a robust MQTT message broker, in a Kubernetes cluster with a single command line to leverage its IaC and IIoT capabilities.

IaC 101: Building Reliable Infrastructure with the Power of Code

Learn the basics of infrastructure as code and see tools and techniques that help you get better value and performance from your infrastructure.

Hand-Picked Tools for Building an Open-Source Data Platform

Get open-source alternatives to market-leading tools that will help you build an end-to-end data platform for high performance and cost savings.

Look Out for These Data Engineering Trends in 2023

Learn about the data engineering solutions that will help organizations optimize costs and prepare to tackle the challenges of the coming year.

Data Visualization in 2023 — Seven Trends to Watch 

Learn about the technologies and approaches transforming data visualization, including application integration, accessibility, UI design and more.

Five Healthcare AI/ML Trends to Watch for in 2023

Learn about the technologies and approaches transforming healthcare, including AI in patient safety, decentralized clinical trials and more.

Ingestions in DBT: How to Load Data from REST APIs with Snowpark

Learn how to transform dbt into a full-fledged ELT solution.

Dagster + Airbyte + dbt: How Software-Defined Assets Change the Way We Orchestrate

See how Dagster changes the game by enabling you to get a simple, singular view of your end-to-end data pipeline across multiple tools.

Extending Airbyte: Creating a Source Connector for Wrike

Find out if Airbyte's claims of outstanding extensibility hold up under scrutiny when building a new connector from scratch for Wrike data.

Introducing TabCSS, Your Shortcut to Styled Backgrounds in Tableau

See how the TabCSS Tableau extension helps you add styled backgrounds to dashboard objects and containers without having to use external tools and images.

Advanced Server-side Filtering for Tables in Appsmith

Learn how you can extend no-code with low-code to enable server-side filtering in Appsmith, the open-source alternative to Retool.

Web3 is Lemons — Go Make Lemonade

Learn about blockchain-based technologies and approaches to formulate an effective strategy that will help you take the web3 transition in stride.

Four Things Data Viz Practitioners Can Do to “Get Better at Design”

Learn tool-agnostic techniques for creating better data visualizations by improving your use of tooltips, space, lines, color and typography.

Fivetran Acquires HVR: You’re in for a Treat

Find out why Fivetran’s acquisition of HVR is great news for organizations looking to optimize their data pipelines and streamline operations.

As We May See: The World after Dashboards

Dashboards are the past. Self-curating data experiences, leveraging AI to compile and represent data, are the future. Learn what's next after dashboards.

Streamline Authentication with Tableau's Connected Apps

Get a data leader’s crash course on Tableau’s new Connected Apps feature and learn how it streamlines authentication and improves security.

New Elements for Tableau Features for Increased Customizability and Ease of Use

The best-in-class extension for Tableau write-back and dashboard collaboration now makes it even easier for teams to enrich data together.

Log4j Vulnerability in Tableau – How to Fix / Workaround [unofficial]

The log4j vulnerability has impacted a number of services, including Tableau, but with a simple config change, you can disable some of log4j’s problematic behavior. Learn how.

Data Science Trends to Rule 2022

Stay ahead of data science trends and learn how to approach the most promising technologies to drive innovation and gain competitive advantage.

Three Signs Your BI Dashboard Development Process Needs Help – and What to Do about It

Learn how common BI dashboard development issues result from mistakes made during planning and find out what you can do to prevent them.

New Dashboard Collaboration and Security Features Added to Elements for Tableau

Elements now supports customizable user tagging and notifications, field-level access control for annotations and Okta Authentication.

Track and Understand Tableau Dashboard Usage with a Free and Open-Source Extension

Learn to use Tableau Usage Tracker, a free and open-source extension that enables you to measure and understand dashboard usage.

8 Best Practices for Working with Your Data Science Vendor — from Data Scientists

Get practical advice from Starschema data scientists to optimize your workflows for better productivity and results from your next project.

10 Tips for Tableau Dashboard Collaboration

Learn basic and extension-enabled Tableau techniques to streamline team communication and collaborate more effectively on dashboard content.

The Top 5 Jenkins Plugins You Should Have in 2021

Learn about the five essential Jenkins plugins that will help you boost productivity and deliver better-quality results.

What’s New in Tableau 2021.1: Snowflake Geospatial Support with Map Layers

Use map layers and leverage Snowflake geospatial support in Tableau 2021.1 to create advanced location-based analytics with multiple object types.

Testing SQL Pool Performance in Azure Synapse Analytics

The feature set of Synapse Analytics is considerably richer than that of a “plain” old Azure SQL Database, but what benefits do we get from the smallest dedicated pool?

The COVID Tracking Project is Shutting Down in a Week. What Next?

The COVID Tracking Project has been one of the most successful citizen-driven data collection projects in history. Driven by The Atlantic and supported by an army of volunteers, it has collected the nuggets of information about testing and case counts, often beating federal and state authorities to the race...

Introducing the Starschema Worldwide Address Data Set in Snowflake

We’re happy to announce that, as part of our ongoing effort to democratize data, we’ve taken over as the provider of The Worldwide Address Data Set, a free and open global address collection on the Snowflake Data Marketplace.

Vaccine Tracking Added to The Starschema COVID-19 Epidemiological Dataset

In our effort to help organizations assess contingency plans and make informed, data-driven decisions in real-time as they respond to the global health emergency, we’ve added vaccine tracking from the University of Oxford to the Starschema COVID-19 Epidemiological Dataset.

StarSnow: HTTP Client for Snowflake SQL

Snowflake is an extremely SQL-friendly database: you can ingest, transform, and access your structured and semi-structured data directly from your SQL code. However, as a cloud-only data platform, it has some fundamental restrictions...