databases
The Entity-Attribute-Value (EAV) model is both Magento's greatest strength and its most persistent performance bottleneck. It gives you unlimited flexibility to add product attributes without schema changes, but that flexibility comes at a cost — every product load can spawn dozens of JOIN operations across multiple tables. If your category pages load slowly, your product detail pages feel sluggi…
Germany has no Companies House Unlike the UK's free official API, German company data is fragmented across regional courts and published through the Handelsregister and Unternehmensregister — built for humans, not machines. Developers face session-bound forms, strict rate limits, and messy HTML instead of structured data. The 2024 eGbR update that broke legacy pipelines In early 2024 Germany intr…
PostgreSQL Error 22P02: invalid text representation PostgreSQL error 22P02 invalid_text_representation occurs when you try to convert a string value into a specific data type, but the string's format is incompatible with that type. This is one of the most common errors in production environments, typically surfacing during user input processing, ETL pipelines, or API integrations where data types…
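As an illustration of the kind of input that leads to 22P02, the sketch below checks a string on the application side before it is sent to an integer column. The helper name `parse_int_or_none` is hypothetical, and the pattern is deliberately conservative: it accepts only plain ASCII decimal integers with an optional sign, which may be narrower than what a given PostgreSQL version accepts.

```python
import re
from typing import Optional

# Conservative pattern (an assumption for this sketch): optional sign
# followed by ASCII decimal digits only.
_PLAIN_INT = re.compile(r"[+-]?[0-9]+")


def parse_int_or_none(raw: str) -> Optional[int]:
    """Return the integer value of raw, or None if it is not a plain decimal integer."""
    text = raw.strip()  # tolerate surrounding whitespace
    if _PLAIN_INT.fullmatch(text) is None:
        return None  # e.g. "abc", "", "12.5" would be rejected here
    return int(text)
```

Rejecting the value before the query runs lets the application return a clear validation message instead of surfacing a database error.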

When working with ClickHouse®, writing a query is usually straightforward. Writing an efficient query, however, requires understanding how ClickHouse reads and filters data. Many users assume that adding a simple WHERE clause automatically results in fast query performance. While filtering is certainly important, not all filters are equally effective. The difference between a query that scans mil…

If you're building a modern data stack that requires either high-throughput transaction processing or large-scale analytical workloads, you've likely come across both Vertica and VoltDB (now rebranded as Volt Active Data). While both are distributed relational database management systems (RDBMS), they are architected for completely opposite use cases — choosing the wrong one can lead to 10x highe…
Recently, I completed my first full Data Engineering project: building an end-to-end ETL pipeline using real-world Australian weather data spanning 10 years. The dataset contained over 145,000 rows, and the goal of the project was to understand how modern data systems ingest, process, validate, and orchestrate data workflows. Rather than focusing only on completing the project quickly, I wanted t…

When working with analytical databases, enriching fact tables with customer information, product metadata, geographical mappings, or business attributes is a common requirement. Traditionally, this enrichment is performed using SQL JOINs. While JOINs work well, they can become increasingly expensive as data volumes grow. Every query must repeatedly scan and match rows, adding latency and consumin…
One thing that confused me when I first started learning ClickHouse was the word FINAL . Because eventually you'll come across both: SELECT * FROM events FINAL ; and: OPTIMIZE TABLE events FINAL ; At first glance, they sound like they should do roughly the same thing. After all, both contain the word FINAL . But they actually solve two completely different problems. One affects query results. The…
Table of contents: Why this is worth understanding; The computer storage hierarchy: from registers to SSD; Core database engine architecture: Page Cache, WAL, Checkpoint; Practical examples; Summary. 1. Why this is worth understanding. No matter which database or language you use, you do something like this every day: # Python + psycopg2 (PostgreSQL) cur.execute("INSERT INTO roles (name) VALUES (%s)", ("admin",)) conn.commit() // C# + EF Core _db.Roles.Add(role); await _db.SaveChangesAsync(); await tx.CommitAsync(); // Node.js + pg (PostgreSQL) await client.query(…
DuckDB Data Inlining, SQLite Fossildelta OOB, Postgres 19 Temporal Data Today's Highlights Today's highlights include DuckDB's innovative data inlining for stream processing in data lakes, offering significant performance gains by eliminating the small files problem. Additionally, a critical out-of-bounds read vulnerability in SQLite's fossildelta extension and a peek into PostgreSQL 19's focus o…
Looking Forward to Postgres 19: It's About Time Recently, a new type of question has entered the database arena: what did this data look like last Tuesday? Maybe it's the price of a product before the holiday sale kicked in, or which department an employee belonged to before that reorg nobody asked for. Short of adding an entire audit trigger system, how can we know what data looked like before a…
When people first start learning ClickHouse®, they usually focus on SQL queries, table engines, and performance optimization. But as data grows from millions to billions of rows, architecture becomes the real game changer. Today, I explored three core concepts that make ClickHouse scalable and reliable: Nodes, Shards, and Replicas. Let's understand them with a simple example. Imagine an E-Commerc…

I often see beginners struggle with databases because most resources either jump straight into SQL syntax or spend too much time on theory. Over the past several months, I wrote a beginner-focused book called Database Management Systems for Beginners: Learn SQL, Tables, Queries, and Data Design. The goal was simple: explain databases the way I wish they had been explained to me when I started. Th…
PostgreSQL Error 22003: Numeric Value Out of Range PostgreSQL error code 22003 ( numeric_value_out_of_range ) is raised when you attempt to store or compute a value that exceeds the boundaries of a numeric data type. This can happen during a simple INSERT , an UPDATE , or even a complex arithmetic operation inside a query. It is one of the most common data integrity errors in production environme…
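A minimal sketch of a pre-insert range check for PostgreSQL's fixed-width integer types, which is one way to catch values that would raise 22003. The names `PG_INTEGER_RANGES` and `fits_pg_integer` are hypothetical and used only for illustration; the bounds are the documented ranges of `smallint`, `integer`, and `bigint`.

```python
# Documented bounds of PostgreSQL's fixed-width integer types.
PG_INTEGER_RANGES = {
    "smallint": (-2**15, 2**15 - 1),  # 2 bytes
    "integer": (-2**31, 2**31 - 1),   # 4 bytes
    "bigint": (-2**63, 2**63 - 1),    # 8 bytes
}


def fits_pg_integer(value: int, type_name: str) -> bool:
    """Return True if value lies within the range of the named integer type."""
    low, high = PG_INTEGER_RANGES[type_name]
    return low <= value <= high
```

A check like this only covers values supplied by the application; arithmetic performed inside a query can still overflow on the server.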
Something is slow. Maybe a page takes forever to load, maybe a migration is hanging, maybe your Supabase dashboard just spins. You suspect a query is stuck somewhere in your database, but you can't see what's happening — Postgres doesn't exactly surface this on its own. Turns out it does. You just need to ask. Seeing what's running Postgres keeps track of every active connection and what it's doi…
The only scalable delete in Postgres is DROP TABLE Tom Pang | Counterintuitively, large DELETEs add work to the database. From experience we can plainly claim the following: the most scalable Postgres data-deletion strategies revolve around deleting entire tables. Individual row DELETE is fine at a small scale. However, big batch DELETE operations don't immediately free up physical disk space, ad…
PostgreSQL Error 2200G: Most Specific Type Mismatch PostgreSQL error code 2200G ( most_specific_type_mismatch ) is a SQL-standard data exception that occurs when a value's type does not match the most specific (most derived) type expected in a context involving type hierarchies, XML schema types, or user-defined structured types. It most commonly appears when working with composite types, domain …
Securing PostgreSQL, in the order an attacker would try things Dan Draper for CipherStash Jun 10 # postgres # security # database # tutorial 7 min read
Databases are core to most software systems, and their design directly influences both scalability and performance. Here’s what every engineer should know: Key Roles of Databases in Scalability Horizontal Scaling: Distributed databases can be split across multiple servers (sharding) to handle large datasets and more users. Example: A social media app splits user data among different servers by ge…
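To make the sharding idea concrete, here is a minimal sketch of routing a record to a shard. It uses hash-based routing on a user ID, which is a different shard key from the one in the excerpt's example; the function name `shard_for` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # A stable hash keeps the mapping identical across processes and restarts
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Simple modulo routing remaps most keys when `shard_count` changes, which is why systems that resize often use consistent hashing or a lookup table instead.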














