Databases (Work in Progress)

I had to dig really deep into general knowledge of databases recently. Aside from general definitions, there are concepts and notes of interesting things such as graph databases.

Schema - construct used to organize database objects in a manner similar to namespaces in programming languages. A schema contains all the database named objects, including tables, views, functions, aggregates, indexes, sequences, triggers, data types, domains, and ranges.

Database - a collection or repository of data. Mostly managed by a database management system. Tabular, semi structured, or unstructured data.

Quick Commands

``` CREATE TABLE customer (
customer_id SERIAL );


Data Types

There are Numeric, Character, Date, and Time data types in PostgreSQL.

Numeric Types

Smallint - 2 bytes, SQL equivalent Int2, range -32768 to 32767.

Int - 4 bytes, SQL equivalent Int4 or int, range is -2147483648 to +2147483647.

Bigint - 8 bytes, SQL equivalent Int8, range is 9223372036854775808 to +9223372036854775807.

Numeric or decimal - variable size, up to 131072 digits before the decimal point; up to 16383 digits after the decimal point.

real - 4 bytes, has a special value of infinity, Platform dependent, at least 6 digit precision. Often the range is 1E-37 to 1E+37.

Double precision - 8 bytes, has special value of Infinity and NaN, Platform dependent, at least 15 digit precision, often the range is 1E-307 to 1E+308.

Serial types (smallserial, serial, bigserial) are wrappers on top of smallint, int, and biginteger. They are used as surrogate keys and aren't allowed to have a null value. It also uses sequences behind the scenes. A sequence is a database object that is used to generate sequences by specifying the minimum and maximum values, including their increment values.



- Atomicity - all or nothing transactions; no partial failures or successes. - Consistency - any transactions gets the db from one valid state to another. - Isolation - Concurrent execution of transactions results in the same state if transactions ran serially. - Durability - Committed(WAL)

- Schema Free, simplicity of design, horizontal scaling and easy replication + large amount of data support.

Relational Database - A database structured to recognize relations among stored items of information. Wikipedia

Object-Relational Database -

NoSQL Database - data storage, manipulation and retrieval for non-relational data. Distributed, open source, horizontally scalable. BASE model AKA availability over consistency.

CAP Theorem - Concept of thinking that it's impossible for a distributing computing system to simultaneously provide all three of the following guarantees:

  • Consistency - All clients immediately see the latest data, including updates.

  • Availability - All clients can find a replica of some data even in the case of node failure. If part of the system goes down, the clients can still access the data.

  • Partition Tolerance - The system continues to work regardless of message loss or partial system failure.

Joins - a clause used to combine records from two or more tables in a database. A joins is a means for combining fields from two tables by using values common to each. DOCUMENTATION FROM POSTGRESQL

Inner Join - SELECT EMP_ID, NAME, DEPT FROM COMPANY INNER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID. Each row of table 1 is compared with each row of table 2. If the join predicate is satisfied, column values for each matched pair of rows for table 1 and 2 are combined into a result row.

Consistency Model Explanation by Neo4j

PostgreSQL vs MongoDB

PostgreSQL vs MySQL