Sep 2, 2099

𝄪


𝄪

3 min to read

Discomfort Is Your Best Teacher

Finding comfort through continuous discomfort

We’ve worked with over 50+ organizations building data platforms, pipelines, and analytics strategies. And across every single one — from startups to scaled enterprises — we keep seeing the same 10 mistakes that ruin even the best dashboards.

If you’re:

  • Building your first data team

  • Rolling out modern data infra

  • Already invested in tools like dbt, Snowflake, or Looker

Read this. These mistakes can cost you a fortune in wasted time, bad decisions, and dashboards no one trusts.

1. Skipping the Source Layer (Raw Ingest)

Mistake: Transforming on load or skipping raw ingestion altogether.
Why it’s bad:

  • You lose your audit trail

  • Debugging becomes a nightmare

  • There's no way to trace data back to its origin

What to do instead: Always ingest raw source data untouched — even if it’s messy. You can clean later, but keep the original.

2. Mixing Cleaning and Business Logic Too Early

Mistake: Writing joins, KPI logic, and complex calcs in the same layer where you’re cleaning data.
Why it’s bad:

  • Hard to debug

  • No clear separation of concerns

  • Creates black-box pipelines no one understands

What to do instead:
Clean your data first. Save business logic for later layers — like object models or datamarts.

3. Doing Too Much in One Layer

Mistake: Building dashboards off raw or semi-cleaned data.
Why it’s bad:

  • Fragile pipelines

  • “Spaghetti” SQL

  • No reusability for other teams or use cases

What to do instead:
Respect your pipeline layers. The ideal flow:

Source → Preprocess → Objects → Datamarts → Dashboards

This saves you a ton of pain down the line.

4. Skipping Referential Integrity Checks

Mistake: Not checking for broken joins or duplicate IDs in fact/dim tables.
Why it’s bad:

  • Dashboards break silently

  • KPIs show the wrong numbers

  • You lose trust fast

What to do instead:
Set up dbt tests for things like:

  • not_null

  • unique

  • Foreign key integrity

Especially on any *_id fields. These tests catch issues early before they become expensive.

5. Not Following Naming Conventions

Mistake: Tables like crm, sales, 1221, cm, branding... chaos.
Why it’s bad:

  • New team members get confused

  • Self-serve analytics becomes impossible

  • You end up answering the same questions every week

What to do instead:
Use a clear naming convention like:

  • fact_ for metrics tables (e.g., fact_bookings)

  • dim_ for dimension tables (e.g., dim_customers)

  • Plural names and no cryptic abbreviations

Consistency = clarity.

6. SELECT * Everywhere

Mistake: Using SELECT * in your SQL.
Why it’s bad:

  • Schema changes silently break downstream logic

  • Extra, unused columns slow performance

  • You have no control over the pipeline’s shape

What to do instead:
Explicitly list your columns. Yes, it’s annoying upfront — but it’ll save you hours of debugging later.

7. Overcomplicating Too Early

Mistake: Trying to normalize everything, implement SCD2, or build semantic layers on Day 1.
Why it’s bad:

  • Slows everything down

  • Creates technical debt with no ROI

  • Team gets stuck optimizing for the wrong things

What to do instead:
Start simple. Document your shortcuts. Build complexity when the business actually needs it.

8. Ignoring Business Context

Mistake: Modeling for what’s easiest for BI/engineering — not what makes sense to stakeholders.
Why it’s bad:

  • Your dashboards won’t get used

  • KPIs won’t match the way teams make decisions

  • Your data team becomes a bottleneck

What to do instead:
Talk to the business. Seriously. Interview stakeholders, understand how they think, and model around that logic.

9. No Testing or Documentation

Mistake: No dbt tests, no schema.yml files, no documentation.
Why it’s bad:

  • People stop trusting the data

  • Bugs sneak in

  • New hires take forever to onboard

What to do instead:
Use .yml files for schema + test definitions. Enable dbt docs. Make documentation a default part of your workflow.

10. Letting Business Users Query Raw Data

Mistake: Giving analysts direct access to raw/preprocessed layers.
Why it’s bad:

  • Errors, wrong KPIs, wasted time

  • You get asked to “fix the dashboard” when the data logic was wrong to begin with

What to do instead:
Business users should only use trusted datamarts or semantic layers — never the messy stuff.

Final Takeaway

These mistakes don’t just cause bugs.
They cause team churn, business distrust, and wasted millions.

If you’re building (or rebuilding) your data stack, keep it simple:

  • Respect your layers

  • Document everything

  • Talk to the business

  • Test early and often

We’ve helped dozens of companies get this right — without blowing the budget or rebuilding three times.

📞 Want to Build a Scalable, Trustworthy Data Stack?

At Aztela, we help teams build lean, fast, and reliable data platforms that actually get used.

Whether you're:

  • Starting from scratch

  • Rebuilding a broken stack

  • Struggling with trust in metrics

We’ll give you a custom data architecture roadmap, based on your goals, stack, and team size — for free.

📅 Schedule a Data Strategy Session →
Let’s help you avoid these mistakes before they cost you.

🔍 FAQs

Do I really need a raw ingest layer?
Yes — always keep untouched raw data. You’ll thank yourself when debugging or auditing data issues.

What’s the ideal layer structure for modern data infra?
Source → Preprocess → Object → Datamart → Dashboard. Keep each layer focused and clean.

Can I skip documentation if I’m a solo data engineer?
Nope. Your future self (and your next hire) will depend on it. Even lightweight .yml + dbt docs go a long way.

What if I’ve already made these mistakes?
That’s normal. We’ve cleaned up these problems dozens of times. The fix is usually simpler than you think.

Let me know if you'd like:

  • A downloadable checklist of the 10 mistakes

  • Visual diagrams for each data layer

  • A services page version to turn this into a conversion funnel

  • A LinkedIn version to repurpose for posts or threads

Ready when you are.

Content

FOOTNOTE

This article was generated by AI and should not be considered an original work. It may contain inaccuracies or hallucinated information. Please use it as an example only and replace the content with your writing.