Databases (Work in Progress)
I had to dig really deep into general knowledge of databases recently. Aside from general definitions, there are concepts and notes of interesting things such as graph databases.
Schema - construct used to organize database objects in a manner similar to namespaces in programming languages. A schema contains all the database named objects, including tables, views, functions, aggregates, indexes, sequences, triggers, data types, domains, and ranges.
Database - a collection or repository of data. Mostly managed by a database management system. Tabular, semi structured, or unstructured data.
Quick Commands
```
CREATE TABLE customer (
customer_id SERIAL
);
=
Data Types
There are Numeric, Character, Date, and Time data types in PostgreSQL.
Numeric Types
Smallint - 2 bytes, SQL equivalent Int2, range -32768 to 32767.
Int - 4 bytes, SQL equivalent Int4 or int, range is -2147483648 to +2147483647.
Bigint - 8 bytes, SQL equivalent Int8, range is 9223372036854775808 to +9223372036854775807.
Numeric or decimal - variable size, up to 131072 digits before the decimal point; up to 16383 digits after the decimal point.
real - 4 bytes, has a special value of infinity, Platform dependent, at least 6 digit precision. Often the range is 1E-37 to 1E+37.
Double precision - 8 bytes, has special value of Infinity and NaN, Platform dependent, at least 15 digit precision, often the range is 1E-307 to 1E+308.
Serial types (smallserial, serial, bigserial) are wrappers on top of smallint
, int
, and biginteger
. They are used as surrogate keys and aren't allowed to have a null value. It also uses sequences behind the scenes. A sequence is a database object that is used to generate sequences by specifying the minimum and maximum values, including their increment values.
=
Concepts
ACID:
- Atomicity - all or nothing transactions; no partial failures or successes.
- Consistency - any transactions gets the db from one valid state to another.
- Isolation - Concurrent execution of transactions results in the same state if transactions ran serially.
- Durability - Committed(WAL)
BASE:
- Schema Free, simplicity of design, horizontal scaling and easy replication + large amount of data support.
Relational Database - A database structured to recognize relations among stored items of information. Wikipedia
Object-Relational Database -
NoSQL Database - data storage, manipulation and retrieval for non-relational data. Distributed, open source, horizontally scalable. BASE model AKA availability over consistency.
CAP Theorem - Concept of thinking that it's impossible for a distributing computing system to simultaneously provide all three of the following guarantees:
Consistency - All clients immediately see the latest data, including updates.
Availability - All clients can find a replica of some data even in the case of node failure. If part of the system goes down, the clients can still access the data.
Partition Tolerance - The system continues to work regardless of message loss or partial system failure.
Joins - a clause used to combine records from two or more tables in a database. A joins is a means for combining fields from two tables by using values common to each. DOCUMENTATION FROM POSTGRESQL
Inner Join - SELECT EMP_ID, NAME, DEPT FROM COMPANY INNER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID
. Each row of table 1 is compared with each row of table 2. If the join predicate is satisfied, column values for each matched pair of rows for table 1 and 2 are combined into a result row.