Saturday, August 10, 2024

CRUD Operations

I made CRUD Operations on the database to proof that it works correctly:

The most important Query this database was made for is getting employee history:

Q6: Write a Query that returns current and past jobs (include employee name, job title, department, manager name, start and end date for position) for employee Toni Lembeck.




Q5: How many employees are in each department?




Q1: Return a list of employees with Job Titles and Department Names:





Q2: Insert Web Programmer as a new job title:




ETL with PostgreSQL

I made data ingestion with ETL. I passed the data from staging table into its corresponding tables normalized to 3NF.

This is the Excel unnormalized table:


And this is the Physical ERD normalized to 3NF:




I started with salary table, I inserted unique values with the select distinct code:


Education table is the same.

Location table I needed to group by Address, Location Name, City and State, since these are company locations:


With Employee I grouped by EMP_ID, Employee Name and Email, this removes data redundancy for employees in the original table and keeps data integrity:


For Employee History I needed to join all the other tables with staging table so I could add the foreign keys, 

With a join all the asignations are ordered based on the on statement so I put the staging table first so that all the data could be visible.

I made a Left Join with managers because not all employees have managers, so I need to insert empties in that space:


And that's it.

Thursday, August 8, 2024

DDL in PostgreSQL

I made the DDL for the HR Database using the Physical ERD normalized to 3NF.

I used PostgreSQL.


I created the employee_history table with the CREATE TABLE command, I added primary key and all the atrributes:


Then I added constraints to the table for adding the foreing keys with ALTER TABLE:



Database Normalization

I normalized a Database to the 3rth Normal Form. 

I was given a dataset in an Excel spreadsheet, that was unnormalized:

For First Normal Form I made sure there were no repetitive groupings in the data.

and no repetitive data within each cell.


For Second Normal Form I made sure that All columns depend on the Unique ID.


For Third Normal Form I made that there were no transitive dependencies in the data:


The conceptual ERD Looks like this:



For normalization I needed to remove repetitions in the data, so the most important part is that the employees have data history in the information I received, so one important Query that I needed to be made is:

  • Write a Query that returns current and past jobs (including employee name, job title, department, manager name, start and end date for position) for employee any employee,

So i needed an employee_history table that could keep the information I needed for this query:




The logical ERD keeps a table for employee, normalizing its data, there is another for company locations, Education and Salaries.

This design achieves the query that I want and also keeps data integrity in the employee data.


The manager can also be grabbed from the employee table as a second connection to employee_history.

The data types for this Physical ERD are for PostgreSQL.