This series of tutorials guides viewers through building a comprehensive analytics API from scratch using Python and FastAPI. It focuses on creating a robust backend service that ingests and stores time-series data efficiently in PostgreSQL with TimescaleDB, an extension optimized for exactly this workload. The process involves setting up a development environment with Docker, defining data models and API endpoints, implementing data validation with Pydantic, and performing CRUD operations using SQLModel. The tutorials also cover aggregating time-series data with TimescaleDB’s hyperfunctions and deploying the application to production using Timescale Cloud and Railway. Finally, they show how to run the deployed API as a private networked service, accessible only to other applications in the same project, such as a Jupyter notebook server used for data analysis and visualization.
Analytics API Microservice Study Guide
Quiz
- What is the primary goal of the course as stated by Justin Mitchell? The main objective is to build an analytics API microservice using FastAPI. This service will be designed to receive and analyze web traffic data from other services.
- Why is TimescaleDB being used in this project, and what is its relationship to PostgreSQL? TimescaleDB is being used because it is a PostgreSQL extension optimized for time-series data and analytics. It enhances PostgreSQL’s capabilities for bucketing and aggregating data based on time.
- Explain the concept of a private analytics API service as described in the source. A private analytics API service means that the API will not be directly accessible from the public internet. It will only be reachable from internal resources that are specifically granted access.
- What is the significance of the code being open source in this course? The open-source nature of the code allows anyone to access, grab, run, and deploy their own analytics API whenever they choose. This gives users complete control and flexibility over their analytics infrastructure.
- Describe the purpose of setting up a virtual environment for Python projects. Creating a virtual environment isolates Python projects and their dependencies from each other. This prevents conflicts between different project requirements and ensures that the correct Python version and packages are used for each project.
- What is Docker Desktop, and what are its key benefits for this project? Docker Desktop is an application that allows users to build, share, and run containerized applications. Its benefits for this project include creating a consistent development environment that mirrors production, easy setup of databases like TimescaleDB, and simplified sharing and deployment of the API.
- Explain the function of a Dockerfile in the context of containerizing the FastAPI application. A Dockerfile contains a set of instructions that Docker uses to build a container image. These instructions typically include specifying the base operating system, installing necessary software (like Python and its dependencies), copying the application code, and defining how the application should be run.
- What is Docker Compose, and how does it simplify the management of multiple Docker containers? Docker Compose is a tool for defining and managing multi-container Docker applications. It uses a YAML file to configure all the application’s services (e.g., the FastAPI app and the database) and allows users to start, stop, and manage them together with single commands.
- Describe the purpose of API routing in a RESTful API like the one being built. API routing defines the specific URL paths (endpoints) that the API exposes. It dictates how the API responds to different requests based on the URL and the HTTP method (e.g., GET, POST) used to access those endpoints.
- What is data validation, and how is Pydantic (and later SQLModel) used for this purpose in the FastAPI application? Data validation is the process of ensuring that incoming and outgoing data conforms to expected formats and types. Pydantic (and subsequently SQLModel, which is built on Pydantic) is used to define data schemas, allowing FastAPI to automatically validate data against these schemas and raise errors for invalid data.
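To make the last answer concrete, here is a minimal sketch of FastAPI validating a request body with a Pydantic schema. The field names (page, user_agent, duration) and the /api/events path are illustrative assumptions, not the course’s exact code.

```python
# Minimal sketch: FastAPI + Pydantic request-body validation (assumed field names).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class EventCreateSchema(BaseModel):
    page: str                      # required field
    user_agent: str | None = None  # optional field
    duration: int = 0              # defaults to 0 if omitted

@app.post("/api/events")
def create_event(payload: EventCreateSchema):
    # FastAPI has already parsed and validated the JSON body at this point;
    # an invalid payload receives a 422 response without reaching this function.
    return {"results": payload.model_dump()}
```

With a schema like this in place, a POST body missing `page` or sending a non-integer `duration` is rejected automatically before any application logic runs.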
Essay Format Questions
- Discuss the benefits of using a microservice architecture for an analytics API, referencing the technologies chosen in this course (FastAPI, TimescaleDB, Docker).
- Explain the process of setting up a development environment that closely mirrors a production environment, as outlined in the initial sections of the course, and discuss the rationale behind each step.
- Compare and contrast the roles of virtual environments and Docker containers in isolating and managing application dependencies and runtime environments.
- Describe the transition from using Pydantic schemas for data validation to SQLModel for both validation and database interaction, highlighting the advantages of SQLModel in building a data-driven API.
- Explain the concept of time-series data and how TimescaleDB’s hypertable feature is designed to efficiently store and query this type of data, including the significance of chunking and retention policies.
Glossary of Key Terms
- API (Application Programming Interface): A set of rules and protocols that allows different software applications to communicate and exchange data with each other.
- Microservice: An architectural style that structures an application as a collection of small, independent services, each focusing on a specific business capability.
- Analytics API: An API specifically designed to provide access to and facilitate the analysis of data.
- Open Source: Software with source code that is freely available and can be used, modified, and distributed by anyone.
- FastAPI: A modern, high-performance web framework for building APIs with Python, based on standard Python type hints.
- TimescaleDB: An open-source time-series database built as a PostgreSQL extension, optimized for high-volume data ingestion and complex queries over time-series data.
- PostgreSQL: A powerful, open-source relational database management system known for its reliability and extensibility.
- Time Series Data: A sequence of data points indexed in time order.
- Jupyter Notebook: An open-source web-based interactive development environment that allows users to create and share documents containing live code, equations, visualizations, and narrative text.
- Virtual Environment: An isolated Python environment that allows dependencies for a specific project to be installed without affecting other Python projects or the system-wide Python installation.
- Docker: A platform for building, shipping, and running applications in isolated environments called containers.
- Container: A lightweight, standalone, and executable package of software that includes everything needed to run an application: code, runtime, system tools, libraries, and settings.
- Docker Desktop: A user-friendly application for macOS and Windows that enables developers to build and share containerized applications and microservices.
- Dockerfile: A text document that contains all the commands a user could call on the command line to assemble a Docker image.
- Docker Compose: A tool for defining and running multi-container Docker applications. It uses a YAML file to configure the application’s services.
- API Endpoint: A specific URL that an API exposes and through which client applications can access its functionalities.
- REST API (Representational State Transfer API): An architectural style for designing networked applications, relying on stateless communication and standard HTTP methods.
- API Routing: The process of defining how the API responds to client requests to specific endpoints.
- Data Validation: The process of ensuring that data meets certain criteria, such as type, format, and constraints, before it is processed or stored.
- Pydantic: A Python data validation and settings management library that uses Python type hints.
- SQLModel: A Python library for interacting with SQL databases, built on Pydantic and SQLAlchemy, providing data validation, serialization, and database integration.
- SQLAlchemy: A popular Python SQL toolkit and Object Relational Mapper (ORM) that provides a flexible and powerful way to interact with databases.
- Database Engine: A software component that manages databases, allowing users to create, read, update, and delete data. In SQLModel, it refers to the underlying SQLAlchemy engine used to connect to the database.
- SQL Table: A structured collection of data held in a database, consisting of columns (attributes) and rows (records).
- Environment Variables: Dynamic named values that can affect the way running processes will behave on a computer. They are often used to configure application settings, such as database URLs.
- Hypertable: A core concept in TimescaleDB, representing a virtual continuous table across all time and space, which is internally implemented as multiple “chunk” tables.
- Chunking: The process by which TimescaleDB automatically partitions hypertable data into smaller, more manageable tables based on time and optionally other partitioning keys.
- Retention Policy: A set of rules that define how long data should be kept before it is automatically deleted or archived.
- Time Bucket: A function in TimeScaleDB that aggregates data into specified time intervals.
- Gap Filling: A technique used in time-series analysis to fill in missing data points in a time series, often by interpolation or using a default value (see the sketch after this glossary).
- Railway: A cloud platform that simplifies the deployment and management of web applications and databases.
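As a brief illustration of the Time Bucket and Gap Filling entries above, the following sketch runs TimescaleDB’s time_bucket_gapfill function as raw SQL through SQLAlchemy. The table name (eventmodel), column names, and connection URL are assumptions for illustration, not the course’s exact schema.

```python
# Hedged sketch: hourly counts with gaps filled, via TimescaleDB's time_bucket_gapfill.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost:5432/analytics")  # placeholder URL

# time_bucket_gapfill() needs a bounded time range; the WHERE clause supplies it.
query = text("""
    SELECT
        time_bucket_gapfill(INTERVAL '1 hour', time) AS bucket,
        COUNT(*) AS event_count
    FROM eventmodel
    WHERE time > NOW() - INTERVAL '1 day' AND time <= NOW()
    GROUP BY bucket
    ORDER BY bucket
""")

with engine.connect() as conn:
    for bucket, event_count in conn.execute(query):
        # Hours with no events appear with a NULL count instead of being skipped.
        print(bucket, event_count)
```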
Briefing Document: Analytics API Microservice Development
This briefing document summarizes the main themes and important ideas from the provided source material, which outlines the development of an open-source analytics API microservice using FastAPI, PostgreSQL with the TimescaleDB extension, and various other modern technologies.
Main Themes:
- Building a Private Analytics API: The primary goal is to create a microservice that can ingest and analyze web traffic data from other services within a private network.
- Open Source and Customization: The entire codebase is open source, allowing users to freely access, modify, and deploy their own analytics API.
- Modern Technology Stack: The project utilizes FastAPI (a modern, fast web framework for building APIs), PostgreSQL (a robust relational database), TimescaleDB (a PostgreSQL extension optimized for time-series data and analytics), Docker (for containerization), and other related Python packages.
- Time-Series Data Analysis: A key focus is on leveraging TimescaleDB’s capabilities for efficient storage and aggregation of time-stamped data.
- Production-Ready Deployment: The course aims to guide users through the process of deploying the analytics API into a production-like environment using Docker.
- Local Development Environment Setup: A significant portion of the initial content focuses on setting up a consistent and reproducible local development environment using Python virtual environments and Docker.
- API Design and Routing: The process includes designing API endpoints, implementing routing using FastAPI, and handling different HTTP methods (GET, POST).
- Data Validation with Pydantic and SQLModel: The project utilizes Pydantic for data validation of incoming and outgoing requests and transitions to SQLModel for defining database models, which builds upon Pydantic and SQLAlchemy.
- Database Integration with PostgreSQL and TimescaleDB: The briefing covers setting up a PostgreSQL database (initially standard, then upgraded to TimescaleDB) using Docker Compose and integrating it with the FastAPI application using SQLModel.
- Querying and Aggregation of Time-Series Data: The document introduces the concept of time buckets in TimescaleDB for aggregating data over specific time intervals.
- Cloud Deployment with Railway: The final section demonstrates deploying the analytics API and a Jupyter Notebook server (for accessing the API) to the Railway cloud platform.
Most Important Ideas and Facts:
- Open Source Nature: “thing about all of this is all the code is open source so you can always just grab it and run with it and deploy your own analytics API whenever you want to that’s kind of the point”. This emphasizes the freedom and customizability offered by the project.
- FastAPI for API Development: The course uses FastAPI for its speed, ease of use, and integration with modern Python features. “The point of this course is to create an analytics API microservice we’re going to be using fast API so that we can take in a bunch of web traffic data on our other services and then we’ll be able to analyze them as we see fit.”
- TimescaleDB for Time-Series Optimization: TimescaleDB is crucial for efficiently handling and analyzing time-based data. “we’ll be able to analyze them as we see fit and we’ll be able to do this with a lot of modern technology postgres and also specifically time scale so we can bucket data together and do aggregations together.” It enhances PostgreSQL specifically for time-series workloads. “Time scale is a postgres database so it’s still postres SQL but it’s optimized for time series.”
- Private API Deployment: The deployed API is intended to be private and only accessible to authorized internal resources. “This is completely private as in nobody can access it from the outside world it’s just going to be accessible from the resources we deem should be accessible to it…”
- Data Aggregation: The API will allow for aggregating raw web event data based on time intervals. “from this raw data we will be able to aggregate this data and put it into a bulk form that we can then analyze.”
- Time-Based Queries: TimescaleDB facilitates querying data based on time series. “…it’s much more about how to actually get to the point where we can aggregate the data based off of time based off of Time series that is exactly what time scale does really well and enhances Post cres in that way.”
- SQLModel for Database Interaction: The project utilizes SQLModel, which simplifies database interactions by combining Pydantic’s data validation with SQLAlchemy’s database ORM capabilities. “within that package there is something called SQL model which is a dependency of the other package…this one works well with fast API and you’ll notice that it’s powered by pantic and SQL Alchemy…”
- Importance of Virtual Environments: Virtual environments are used to isolate Python project dependencies and avoid version conflicts. “we’re going to create a virtual environment for python projects so that they can be isolated from one another in other words python versions don’t conflict with each other on a per project basis.”
- Docker for Environment Consistency: Docker is used to create consistent and reproducible environments for both development and production. “docker containers are that something what we end up doing is we package up this application into something called a container image…”
- Docker Compose for Multi-Container Management: Docker Compose is used to define and manage multi-container Docker applications, such as the FastAPI app and the PostgreSQL/TimescaleDB database.
- API Endpoints and Routing: FastAPI’s routing capabilities are used to define API endpoints (e.g., /healthz, /api/events) and associate them with specific functions and HTTP methods (GET, POST).
- Data Validation with Pydantic: Pydantic models (Schemas) are used to define the structure and data types of request and response bodies, ensuring data validity.
- Transition to SQLModel for Database Models: Pydantic schemas are transitioned to SQLModel models to map them to database tables. “we’re going to convert our pantic schemas into SQL models…”
- Database Engine and Session Management: SQLAlchemy’s engine is used to connect to the database, and a session management function is implemented to handle database interactions within FastAPI routes.
- Creating Database Tables with SQLModel: SQLModel’s create_all function is used to automatically create database tables based on the defined SQLModel models. “This is going to be our command to actually make sure that our database tables are created and they are created based off of our SQL model class to create everything so it’s SQL model. metadata. create all and it’s going to take in the engine as argument…”
- TimescaleDB Hypertable Creation: To optimize for time-series data, standard PostgreSQL tables are converted into TimescaleDB hypertables (see the sketch after this list). “it’s time to create hyper tables now if we did nothing else at this point and still use this time scale model it would not actually be optimized for time series data we need to do one more step to make that happen a hypertable is a postgress table that’s optimized for that time series data with the chunking and also the automatic retention policy where it’ll delete things later.”
- Hypertable Configuration (Chunking and Retention): TimescaleDB hypertables can be configured with chunk time intervals and data retention policies (drop after). “the main three that I want to look at are these three first is the time column this is defaulting to the time field itself right so we probably don’t ever want to change the time colum itself although it is there it is supported if you need to for some reason in time scale in general what you’ll often see the time column that’s being used is going to be named time or the datetime object itself so that one’s I’m not going to change the ones that we’re going to look at are the chunk time interval and then drop after…”
- Time Bucketing for Data Aggregation: TimescaleDB’s time_bucket function allows for aggregating data into time intervals. “we’re now going to take a look at time buckets time buckets allow us to aggregate data over a Time interval…”
- Cloud Deployment on Railway: The project demonstrates deploying the API and a Jupyter Notebook to the Railway cloud platform for accessibility.
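The hypertable step referenced above can be sketched with the underlying TimescaleDB SQL calls. The course drives this through its helper package, so the snippet below is only an assumed equivalent: the table name, chunk interval, and retention window are placeholders.

```python
# Hedged sketch: convert a table into a hypertable and attach a retention policy.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost:5432/analytics")  # placeholder URL

with engine.begin() as conn:
    # Turn the plain events table into a hypertable partitioned on "time",
    # with one chunk per day (the chunk_time_interval).
    conn.execute(text(
        "SELECT create_hypertable('eventmodel', 'time', "
        "chunk_time_interval => INTERVAL '1 day', if_not_exists => TRUE)"
    ))
    # Automatically drop chunks older than three months (the "drop after" policy).
    conn.execute(text(
        "SELECT add_retention_policy('eventmodel', INTERVAL '3 months', if_not_exists => TRUE)"
    ))
```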
Quotes:
- “thing about all of this is all the code is open source so you can always just grab it and run with it and deploy your own analytics API whenever you want to that’s kind of the point”
- “The point of this course is to create an analytics API microservice we’re going to be using fast API so that we can take in a bunch of web traffic data on our other services and then we’ll be able to analyze them as we see fit and we’ll be able to do this with a lot of modern technology postgres and also specifically time scale so we can bucket data together and do aggregations together”
- “This is completely private as in nobody can access it from the outside world it’s just going to be accessible from the resources we deem should be accessible to it”
- “from this raw data we will be able to aggregate this data and put it into a bulk form that we can then analyze”
- “Time scale is a postgres database so it’s still postres SQL but it’s optimized for time series”
- “within that package there is something called SQL model which is a dependency of the other package…this one works well with fast API and you’ll notice that it’s powered by pantic and SQL Alchemy…”
- “we’re going to create a virtual environment for python projects so that they can be isolated from one another in other words python versions don’t conflict with each other on a per project basis”
- “docker containers are that something what we end up doing is we package up this application into something called a container image…”
- “it’s time to create hyper tables now if we did nothing else at this point and still use this time scale model it would not actually be optimized for time series data we need to do one more step to make that happen a hypertable is a postgress table that’s optimized for that time series data with the chunking and also the automatic retention policy where it’ll delete things later.”
- “we’re now going to take a look at time buckets time buckets allow us to aggregate data over a Time interval…”
This briefing provides a comprehensive overview of the source material, highlighting the key objectives, technologies, and processes involved in building the analytics API microservice. It emphasizes the open-source nature, the use of modern tools, and the focus on efficient time-series data analysis with TimescaleDB.
Building a Private Analytics API with FastAPI and TimescaleDB
Questions
- What is the main goal of this course and what technologies will be used to achieve it? The main goal of this course is to guide users through the creation of a private analytics API microservice. This API will be designed to ingest and analyze web traffic data from other services. The course will primarily use FastAPI as the web framework, PostgreSQL as the database, and specifically TimescaleDB as a PostgreSQL extension optimized for time-series data. Other tools mentioned include Docker for containerization, Cursor as a code editor, and a custom Python package called timescaledb-python.
- Why is the created analytics API intended to be private, and how will it be accessed and tested? The analytics API is designed to be completely private, meaning it will not be directly accessible from the public internet. It will only be reachable from internal resources that are specifically granted access. To test the API after deployment, a Jupyter Notebook server will be used. This server will reside within the private network and will send simulated web event data to the API for analysis and to verify its functionality.
- What is TimescaleDB, and why is it being used in this project instead of standard PostgreSQL? TimescaleDB is a PostgreSQL extension that optimizes the database for handling time-series data. While it is built on top of PostgreSQL and retains all of its features, TimescaleDB enhances its capabilities for data bucketing, aggregations based on time, and efficient storage and querying of large volumes of timestamped data. It is being used in this project because the core of an analytics API is dealing with data that changes over time, making TimescaleDB a more suitable and performant choice for this specific use case compared to standard PostgreSQL.
- Why is the course emphasizing open-source tools and providing the code on GitHub? A key aspect of this course is the commitment to open-source technology. All the code developed throughout the course will be open source, allowing users to freely access, use, modify, and distribute it. The code will be hosted on GitHub, providing a collaborative platform for users to follow along, contribute, and deploy their own analytics API based on the course materials. This also ensures transparency and allows users to have full control over their analytics solution.
- What are virtual environments, and why is setting one up considered an important first step in the course? Virtual environments are isolated Python environments that allow users to manage dependencies for specific projects without interfering with the global Python installation or other projects. Setting up a virtual environment is crucial because it ensures that the project uses the exact Python version and required packages (like FastAPI, Uvicorn, TimescaleDB Python) without conflicts. This leads to more reproducible and stable development and deployment processes, preventing issues caused by differing package versions across projects.
- What is Docker, and how will it be used in this course for development and potential production deployment? Docker is a platform that enables the creation and management of containerized applications. Containers package an application along with all its dependencies, ensuring consistency across different environments. In this course, Docker will be used to set up a local development environment that closely mirrors a production environment. This includes containerizing the FastAPI application and the TimescaleDB database. Docker Compose will be used to manage these multi-container setups locally. While the course starts with the open-source version of TimescaleDB in Docker, it suggests that for production, more robust services directly from Timescale might be considered.
- How will FastAPI be used to define API endpoints and handle data? FastAPI will serve as the web framework for building the analytics API. It will be used to define API endpoints (URL paths) that clients can interact with to send data and retrieve analytics. FastAPI’s features include automatic data validation, serialization, and API documentation based on Python type hints. The course will demonstrate how to define routes for different HTTP methods (like GET and POST) and how to use Pydantic (and later SQLModel) for defining data models for both incoming requests and outgoing responses, ensuring data consistency and validity.
- How will SQLModel be integrated into the project, and what benefits does it offer for interacting with the database? SQLModel is a Python library that combines the functionalities of Pydantic (for data validation and serialization) and SQLAlchemy (for database interaction). It allows developers to define database models using Python classes with type hints, which are then automatically translated into SQL database schemas. In this course, SQLModel will be used to define the structure of the event data that the API will collect and store in the TimescaleDB database. It will simplify database interactions by providing an object-relational mapping (ORM) layer, allowing developers to work with Python objects instead of writing raw SQL queries for common database operations like creating, reading, updating, and deleting data.
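A minimal sketch of that SQLModel workflow, with assumed field names and a placeholder database URL, might look like this:

```python
# Hedged sketch: a SQLModel table model, engine, table creation, and a session.
from typing import Optional
from datetime import datetime, timezone

from sqlmodel import SQLModel, Field, Session, create_engine, select

class EventModel(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    page: str = Field(index=True)
    user_agent: Optional[str] = None
    duration: int = 0
    time: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

engine = create_engine("postgresql://user:password@localhost:5432/analytics")  # placeholder URL

# Create database tables from the SQLModel class definitions.
SQLModel.metadata.create_all(engine)

with Session(engine) as session:
    session.add(EventModel(page="/pricing", user_agent="curl/8.0", duration=12))
    session.commit()
    recent = session.exec(select(EventModel).where(EventModel.page == "/pricing")).all()
```

The same class serves both as a Pydantic-style validation schema and as the ORM mapping for the database table, which is the main convenience SQLModel adds over using Pydantic and SQLAlchemy separately.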
Analytics API for Time Series Data with FastAPI and TimescaleDB
Based on the source “01.pdf”, an Analytics API service is being built to ingest, store, and refine data, specifically time series data, allowing users to control and analyze it. The goal is to create a microservice that can take in a lot of data, such as web traffic data from other services, store it in a database optimized for time series, and then enable analysis of this data.
Here are some key aspects of the Analytics API service as described in the source:
- Technology Stack:
- FastAPI: This is used as the microservice framework and the API endpoint, built with Python. FastAPI is described as a straightforward and popular tool that allows for writing simple Python functions to handle API endpoints. It’s noted for being minimal and flexible.
- Python: The service is built using Python, which is a dependency for FastAPI. The tutorial mentions using Python 3.
- TimescaleDB: This is a PostgreSQL database optimized for time series data and is used to store the ingested data. TimescaleDB enhances PostgreSQL for time series operations, making it suitable for bucketing data and performing aggregations based on time. The course partnered with Timescale.
- Docker: Docker is used to containerize the application, making it portable and able to be deployed anywhere with a Docker runtime, and also to easily run database services like TimescaleDB locally. Containerization ensures the application is production-ready.
- Railway: This is a containerized cloud platform used for deployment, allowing for a Jupyter Notebook server to connect to the private analytics API endpoint.
- Functionality:
- Data Ingestion: The API is designed to ingest a lot of data, particularly web traffic data, which can change over time (time series data).
- Data Storage: The ingested data is stored in a Timescale database, which is optimized for efficient storage and querying of time-based data.
- Data Modeling: Data modeling is a key aspect, especially when dealing with querying the data. The models are based on Pydantic, which is also the foundation for SQLModel, making them relatively easy to work with, especially for those familiar with Django models. A custom Python package, timescaledb-python, was created for this series to facilitate the use of FastAPI and SQLModel with TimescaleDB.
- Querying and Aggregation: The service allows for querying and aggregating data based on time and other parameters. TimescaleDB is particularly useful for time-based aggregations. More advanced aggregations can be performed to gain complex insights from the data.
- API Endpoints: FastAPI is used to create API endpoints that handle data ingestion and querying. The tutorial covers creating endpoints for health checks, reading events (with list and detail views), creating events (using POST), and updating events (using PUT).
- Data Validation: Pydantic is used for both incoming and outgoing data validation, ensuring the data conforms to defined schemas. This helps in hardening the API and ensuring data integrity.
- Development and Deployment:
- Local Development: The tutorial emphasizes setting up a local development environment that closely mirrors a production environment, using Python virtual environments and Docker.
- Production Deployment: The containerized application is deployed to Railway, a containerized cloud service. The process involves containerizing the FastAPI application with Docker and configuring it for deployment on Railway.
- Private API: The deployment on Railway includes the option to create a private analytics API service, accessible only from within designated resources, enhancing security.
- Example Use Case: The primary use case demonstrated is building an analytics API to track and analyze web traffic data over time. This involves ingesting data about page visits, user agents, and durations, and then aggregating this data based on time intervals and other dimensions.
The tutorial progresses from setting up the environment to building the API with data models, handling different HTTP methods, implementing data validation, integrating with a PostgreSQL database using SQLModel, optimizing for time series data with TimescaleDB (including converting tables to hypertables and using time bucket functions for aggregation), and finally deploying the application to a cloud platform. The source code for the project is open source and available on GitHub.
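To give a feel for the example use case, here is a hedged sketch of the kind of “fake web event” sender the Jupyter notebook uses to exercise the API. The host, endpoint path, and payload fields are assumptions for illustration.

```python
# Hedged sketch: send simulated web traffic events to the analytics API.
import random
import requests

API_BASE_URL = "http://localhost:8002"  # placeholder; in production this is the private service URL
PAGES = ["/", "/about", "/pricing", "/contact"]

def send_fake_event() -> dict:
    payload = {
        "page": random.choice(PAGES),
        "user_agent": "Mozilla/5.0 (fake)",
        "duration": random.randint(1, 300),  # simulated seconds spent on the page
    }
    response = requests.post(f"{API_BASE_URL}/api/events", json=payload, timeout=10)
    response.raise_for_status()
    return response.json()

for _ in range(25):
    send_fake_event()
```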
FastAPI for Analytics Microservice API Development
Based on the source “01.pdf” and our previous discussion, FastAPI is used as the microservice framework for building an analytics API service from scratch. This involves creating an API endpoint that primarily ingests a single data model.
Here’s a breakdown of FastAPI’s role in the context of this microservice:
- API Endpoint Development: FastAPI is the core technology for creating the API endpoints. The source mentions that the service will have “well mostly just one data model that we’re going to be ingesting” through these endpoints. It emphasizes that FastAPI allows for writing “fairly straightforward python functions that will handle all of the API inp points”.
- Simplicity and Ease of Use: FastAPI is described as a “really really popular tool to build all of this out” and a “really great framework to incrementally add things that you need when you need them”. It’s considered “very straightforward” and “minimal”, meaning it doesn’t include a lot of built-in features, offering flexibility. The models built with FastAPI (using SQLModel, which is based on Pydantic) are described as “a little bit easier to work with than Django models” with “less to remember” because they are based on Pydantic.
- URL Routing: Similar to other web frameworks, FastAPI facilitates easy URL routing, which is essential for defining the different endpoints of the API service.
- Data Validation and Serialization: FastAPI leverages Pydantic for data modeling. This allows for defining data structures with type hints, which FastAPI uses for automatic data validation and serialization (converting Python objects to JSON and vice versa). We saw examples of this with the Event schema and how incoming and outgoing data were validated.
- HTTP Method Handling: FastAPI makes it straightforward to define functions that handle different HTTP methods (e.g., GET for retrieving data, POST for creating data, PUT for updating data) for specific API endpoints.
- Middleware Integration: FastAPI allows the integration of middleware, such as CORS (Cross-Origin Resource Sharing) middleware, which was added to control which websites can access the API.
- Integration with Other Tools: FastAPI works well with other tools in the described architecture, including:
- SQLModel: Built on top of Pydantic and SQLAlchemy, SQLModel simplifies database interactions and is used within the FastAPI application to interact with the PostgreSQL database (TimescaleDB).
- Uvicorn and Gunicorn: Uvicorn is used as an ASGI server to run the FastAPI application, particularly during development. Gunicorn can be used in conjunction with Uvicorn for production deployments.
- Docker: FastAPI applications can be easily containerized using Docker, making them portable and scalable.
- Production Readiness: The tutorial emphasizes building applications that are production-ready from the start, and FastAPI, along with Docker and Railway, facilitates this. The deployment process to Railway involves containerizing the FastAPI application.
In essence, FastAPI serves as the foundation for building the API interface of the analytics microservice, handling requests, processing data according to defined models, and interacting with the underlying data storage (TimescaleDB) through SQLModel. Its design principles of being high-performance, easy to use, and robust make it well-suited for building modern microservices.
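The routing, HTTP-method handling, and CORS points above can be compressed into a short sketch. The paths (/healthz, /api/events) follow the ones mentioned in this document; everything else, including the allowed origin, is an assumption rather than the course’s exact code.

```python
# Hedged sketch: FastAPI routing, HTTP methods, and CORS middleware.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# CORS middleware controls which websites (origins) may call the API from a browser.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example.com"],  # placeholder origin
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/healthz")
def health_check():
    # Used by the deployment platform to verify the service is up.
    return {"status": "ok"}

@app.get("/api/events")
def list_events():
    # List view; the real version queries the database through SQLModel.
    return {"results": []}

@app.get("/api/events/{event_id}")
def get_event(event_id: int):
    # Detail view keyed by the event's id.
    return {"id": event_id}

@app.put("/api/events/{event_id}")
def update_event(event_id: int, data: dict):
    # Update view; the real version validates `data` against a schema first.
    return {"id": event_id, **data}
```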
TimescaleDB: PostgreSQL for Time Series Data
Based on the source “01.pdf” and our conversation history, TimescaleDB is a PostgreSQL database extension that is optimized for time series data. While it is still fundamentally PostgreSQL, it includes specific features and optimizations that make it particularly well-suited for handling and analyzing data that changes over time, such as the web traffic data for the analytics API service being built.
Here’s a more detailed discussion of PostgreSQL with the Timescale extension:
- Based on PostgreSQL: TimescaleDB is not a separate database but rather an extension that runs within PostgreSQL. This means that it retains all the reliability, features, and the extensive ecosystem of PostgreSQL, while adding time series capabilities. The source explicitly states, “time scale is a postgres database so it’s still postres SQL”. This also implies that tools and libraries that work with standard PostgreSQL, like SQLModel, can also interact with a TimescaleDB instance.
- Optimization for Time Series Data: The core reason for using TimescaleDB is its optimization for time series data. Standard PostgreSQL is not inherently designed for the high volume of writes and complex time-based queries that are common in time series applications. TimescaleDB addresses this by introducing concepts like hypertables.
- Hypertables: A hypertable is a virtual table that is partitioned into many smaller tables called chunks, based on time and optionally other criteria. This partitioning improves query performance, especially for time-based filtering and aggregations, as queries can be directed to only the relevant chunks. We saw the conversion of the EventModel to a hypertable in the tutorial.
- Time-Based Operations: TimescaleDB provides hyperfunctions that are specifically designed for time series analysis, such as time_bucket for aggregating data over specific time intervals. This allows for efficient calculation of metrics like counts, averages, and other aggregations over time.
- Data Retention Policies: TimescaleDB allows for the definition of automatic data retention policies, where older, less relevant data can be automatically dropped based on time. This helps in managing storage costs and maintaining performance by keeping the database size manageable.
- Use in the Analytics API Service: The analytics API service leverages TimescaleDB as its primary data store because it is designed to ingest and analyze time series data (web traffic events).
- Data Ingestion and Storage: The API receives web traffic data and stores it in the Timescale database. The EventModel was configured to be a hypertable, optimized for this kind of continuous data ingestion.
- Time-Based Analysis: TimescaleDB’s time_bucket function is used to perform aggregations on the data based on time intervals (e.g., per minute, per hour, per day, per week). This enables the API to provide insights into how web applications are performing over time.
- Efficient Querying: By using hypertables and time-based indexing, TimescaleDB allows for efficient querying of the large volumes of time series data that the API is expected to handle.
- Deployment: TimescaleDB can be deployed in different ways, as seen in the context of the tutorial:
- Local Development with Docker: The tutorial uses Docker to run a containerized version of open-source TimescaleDB for local development. This provides a consistent and isolated environment for testing the API’s database interactions.
- Cloud Deployment with Timescale Cloud: For production deployment, the tutorial utilizes Timescale Cloud, a managed version of TimescaleDB. This service handles the operational aspects of running TimescaleDB, such as maintenance, backups, and scaling, allowing the developers to focus on building the API. The connection string for Timescale Cloud was configured in the deployment environment on Railway.
- Integration with FastAPI and SQLModel: Despite its specialized time series features, TimescaleDB remains compatible with PostgreSQL standards, allowing for seamless integration with tools like SQLModel used within the FastAPI application. SQLModel, being built on Pydantic and SQLAlchemy, can define data models that map to tables (including hypertables) in TimescaleDB. A custom Python package, timescaledb-python, was even created to further streamline the interaction between FastAPI, SQLModel, and TimescaleDB, especially for defining hypertables.
In summary, TimescaleDB is a powerful extension to PostgreSQL that provides the necessary optimizations and functions for efficiently storing, managing, and analyzing time series data, making it a crucial component for building the described analytics API service. Its compatibility with the PostgreSQL ecosystem and various deployment options offer flexibility for both development and production environments.
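As a concrete (and assumed, not verbatim) example of the time-based analysis described above, the following query buckets events into one-hour intervals and aggregates per page using TimescaleDB’s time_bucket function. Table and column names are placeholders.

```python
# Hedged sketch: hourly per-page aggregation with TimescaleDB's time_bucket.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost:5432/analytics")  # placeholder URL

query = text("""
    SELECT
        time_bucket(INTERVAL '1 hour', time) AS bucket,
        page,
        COUNT(*) AS visits,
        AVG(duration) AS avg_duration
    FROM eventmodel
    GROUP BY bucket, page
    ORDER BY bucket DESC, page
""")

with engine.connect() as conn:
    for row in conn.execute(query):
        print(row.bucket, row.page, row.visits, row.avg_duration)
```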
Docker Containerization for Application Deployment
Based on the source “01.pdf” and our conversation history, Docker containerization is a technology used to package an application and all its dependencies (such as libraries, system tools, runtime, and code) into a single, portable unit called a container image. This image can then be used to run identical containers across various environments, ensuring consistency from development to production.
Here’s a breakdown of Docker containerization as discussed in the source:
- Purpose and Benefits:
- Emulating Production Environment: Docker allows developers to locally emulate a deployed production environment very closely. This helps in identifying and resolving environment-specific issues early in the development process.
- Production Readiness: Building applications with Docker from the beginning makes them production-ready sooner. The container image can be deployed to any system that has a Docker runtime.
- Database Services: Docker simplifies the process of running database services, such as PostgreSQL and TimescaleDB, on a per-project basis. Instead of installing and configuring databases directly on the local machine, developers can spin up isolated database containers.
- Isolation: Docker provides another layer of isolation beyond virtual environments for Python packages. Containers encapsulate the entire application environment, including the operating system dependencies. This prevents conflicts between different projects and ensures that each application has the exact environment it needs.
- Portability: Once an application is containerized, it can be deployed anywhere that has a Docker runtime, whether it’s a developer’s laptop, a cloud server, or a container orchestration platform.
- Reproducibility: Docker ensures that the application runs in a consistent environment, making deployments more reliable and reproducible.
- Key Components:
- Docker Desktop: This is a user interface application that provides tools for working with Docker on local machines (Windows and Mac). It includes the Docker Engine and allows for managing containers and images through a graphical interface.
- Docker Engine: This is the core of Docker, responsible for building, running, and managing Docker containers.
- Docker Hub: This is a cloud-based registry service where container images can be stored and shared publicly or privately. The tutorial mentions pulling Python runtime images and TimescaleDB images from Docker Hub. Images are identified by a name and a tag, which typically specifies a version.
- Dockerfile: This is a text file containing instructions that Docker uses to build a container image. The Dockerfile specifies the base image, commands to install dependencies, the application code to include, and how the application should be run. The tutorial details creating a Dockerfile.web for the FastAPI application, which includes steps like downloading Python, creating a virtual environment, installing requirements, and running the FastAPI application using a boot script.
- Docker Compose: This is a tool for defining and managing multi-container Docker applications. It uses a compose.yaml file to configure the different services (e.g., the FastAPI application, the database) that make up the application, along with their dependencies, networks, and volumes. The tutorial uses Docker Compose to run both the FastAPI application and the TimescaleDB database during local development. Docker Compose allows for defining build parameters (context and Dockerfile), image names, environment variables, port mappings, and volumes. The --watch flag can be used with Docker Compose to automatically rebuild the container when code changes are detected during development.
- Container Image: A lightweight, standalone, executable package that includes everything needed to run a piece of software, including the code, runtime, libraries, environment variables, and configuration files.
- Container: A running instance of a Docker image.
- Usage in the Analytics API Project:
- Containerizing the FastAPI Application: The tutorial focuses on creating a Dockerfile.web to containerize the FastAPI microservice. This ensures that the API can be deployed consistently across different environments.
- Running TimescaleDB: Docker Compose is used to spin up a containerized instance of the open-source TimescaleDB for local development. This simplifies setting up and managing the database dependency.
- Local Development Environment: Docker Compose, along with features like volume mounting, facilitates a stable local development environment that closely mirrors production. Volume mounting allows for code changes on the host machine to be reflected within the running container without needing to rebuild the image every time.
- Production Deployment: The containerized FastAPI application is deployed to Railway, a containerized cloud platform. Railway uses the Dockerfile.web to build and run the application.
In essence, Docker containerization provides a robust and efficient way to develop, package, and deploy the analytics API service and its dependencies, ensuring consistency and portability across different stages of its lifecycle. The source emphasizes using Docker early in the development process to prepare for production deployment.
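A hedged sketch of a compose.yaml along these lines is shown below. The service names, ports, image tag, credentials, and watch rules are illustrative assumptions rather than the course’s exact file.

```yaml
# Hedged sketch of a compose.yaml: FastAPI app container plus a TimescaleDB service.
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile.web
    environment:
      - PORT=8002
      - DATABASE_URL=postgresql://time-user:time-pw@db_service:5432/timescaledb
    ports:
      - "8002:8002"
    depends_on:
      - db_service
    develop:
      watch:                      # used by `docker compose up --watch`
        - action: rebuild
          path: requirements.txt  # dependency changes trigger a rebuild
        - action: sync
          path: ./src
          target: /code           # assumed working directory inside the container

  db_service:
    image: timescale/timescaledb:latest-pg17   # assumed tag; pick the version you need
    environment:
      - POSTGRES_USER=time-user
      - POSTGRES_PASSWORD=time-pw
      - POSTGRES_DB=timescaledb
    ports:
      - "5432:5432"
    volumes:
      - timescale_db_data:/var/lib/postgresql/data

volumes:
  timescale_db_data:
```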
Railway Deployment of Analytics API
Based on the source “01.pdf” and our conversation history, Railway is used as a containerized cloud platform to deploy the analytics API into production. The process involves several key steps to take the Dockerized application and run it in the cloud.
Here’s a discussion of deployment on Railway based on the provided information:
- Containerized Deployment: Railway is designed to deploy containerized applications. Since the analytics API is built using Docker, it is well-suited for deployment on Railway. The process leverages the Dockerfile.web created for the application, which contains all the instructions to build the Docker image.
- railway.json Configuration: A railway.json file is used to configure the deployment on Railway. This file specifies important settings for the build and deployment process, including:
- build command: This command tells Railway how to build the application. In this case, it points to the Dockerfile.web in the project directory.
- watchPaths: These specify the directories and files that Railway should monitor for changes. When changes are detected (e.g., in the src directory or requirements.txt), Railway can automatically trigger a redeployment.
- deploy configuration: This section includes settings related to the running application, such as the health check endpoint (/healthz). Railway uses this endpoint to verify if the application is running correctly.
- GitHub Integration: The deployment process on Railway starts by linking a GitHub repository containing the application code. The source mentions forking the analytics-api repository from cfsh GitHub into a personal GitHub account. Once the repository is linked to a Railway project, Railway can access the code and use the railway.json and Dockerfile.web to build and deploy the application.
- Environment Variables: Railway allows for configuring environment variables that the deployed application can access. This is crucial for settings like the database URL to connect to the production Timescale Cloud instance. Instead of hardcoding sensitive information, environment variables provide a secure and configurable way to manage application settings. The tutorial demonstrates adding the DATABASE_URL obtained from Timescale Cloud as a Railway environment variable. It also shows setting the PORT environment variable, which the application uses to listen for incoming requests.
- Public Endpoints: Railway can automatically generate a public URL for the deployed service, making the API accessible over the internet. This allows external applications or users to interact with the API.
- Private Networking: Railway also supports private networking, allowing services within the same Railway project to communicate internally without being exposed to the public internet. The tutorial demonstrates how to make the analytics API a private service by deleting its public endpoint and then accessing it from another service (a Jupyter container) within the same Railway project using its private network address. This enhances the security of the analytics API by restricting access.
- Health Checks: Railway periodically checks the health check endpoint configured in railway.json to ensure the application is running and healthy. If the health check fails, Railway might automatically attempt to restart the application.
- Automatic Deployments: Railway offers automatic deployments triggered by changes in the linked GitHub repository or updates to environment variables. This streamlines the deployment process, as new versions of the application can be rolled out automatically.
In summary, deploying the analytics API on Railway involves containerizing the application with Docker, configuring the deployment using railway.json, linking a GitHub repository, setting up environment variables (including the production database URL), and leveraging Railway’s features for public or private networking and health checks. Railway simplifies the process of taking a containerized application and running it in a production-like environment in the cloud.
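For illustration, a railway.json along the lines described above might look like the following sketch. The exact key names (for example, watchPatterns versus watchPaths) and the schema URL should be checked against Railway’s current documentation; this is an assumed example, not the course’s verbatim file.

```json
{
  "$schema": "https://railway.app/railway.schema.json",
  "build": {
    "builder": "DOCKERFILE",
    "dockerfilePath": "Dockerfile.web",
    "watchPatterns": ["src/**", "requirements.txt"]
  },
  "deploy": {
    "healthcheckPath": "/healthz",
    "restartPolicyType": "ON_FAILURE"
  }
}
```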
The Original Text
so apparently data is the new oil and if that’s the case let’s learn how we can control that data and store it oursel and then refine it ourself now what I really mean here is we’re going to be building out an analytics API service so that we can ingest a lot of data we can store a lot of data into our database that changes over time or time series data the way we’re going to be doing this is with fast API as our micros service this is going to be our API inpoint that has well mostly just one data model that we’re going to be ingesting from there we’re going to be using of course python to make that happen that’s what fast API is built in but we’re going to be storing this data into a postgres database that’s optimized for time series called time scale I did partner with them on this course so thanks for that time scale but the idea here is we want to be able to keep track of time series data time scale is optimized postgres for Time series data it’s really great I think you’re going to like it now we’re also going to be using Docker in here to make sure that our application is containerized so we can deploy it anywhere we want and we can use the open-source version of time scale to really hone in exactly what it is that we’re trying to build out before we go into production which absolutely we will be now the idea here is once we have it all containerized we’ll then go ahead and deploy it onto a containerized cloud called Railway which will allow us to have a jupyter notebook server that can connect to our private API inpoint our private analytics server altogether now all of this code is open source and I did say it’s done in fast API which really means that we’re going to be writing some fairly straightforward python functions that will handle all of the API inp points and all that the thing that will start to get a little bit more advanced is when we do the data modeling and specifically when we do the querying on the data modeling now these things I think are really really fascinating but I ease you into them so I go step by step to make sure that all of this is working now I do want to show you a demo as to what we end up building at the end of the day we got a quick glance at it right here but I want to go into it a little bit more in depth now I do hope that you jump around especially if you know this stuff already and if you come from the D Jango world the models that we’re going to be building I think are a little bit easier to work with than Jango models there’s less to remember because they’re based in pantic which is what SQL models is based in as well which allows you to write just really simple models it’s really cool it’s really nice to do so those of you who are from D Jango this part will be very straightforward fast API itself is also very straightforward and a really really popular tool to build all of this out now the nice thing about all of this is all the code is open source so you can always just grab it and run with it and deploy your own analytics API whenever you want to that’s kind of the point so if you have any questions let me know my name is Justin Mitchell I’m going to be taking you through this one step by step and I really encourage you to bounce around if you already know some of these things and if you don’t take your time re-watch sections if you have to some of us have to and that’s totally okay I know I had to repeat these sections many time to get it right so hopefully it’s a good one for you and thanks for watching look forward to seeing you in the 
course the point of this course is to create an analytics API microservice we’re going to be using fast API so that we can take in a bunch of web traffic data on our other services and then we’ll be able to analyze them as we see fit and we’ll be able to do this with a lot of modern technology postgres and also specifically time scale so we can bucket data together and do aggregations together let’s take a look at what that means now first and foremost at the very end we are going to deploy this into production and then we will have a jupyter notebook server that will actually access our private analytics API service this is completely private as in nobody can access it from the outside world it’s just going to be accessible from the resources we deem should be accessible to it which is what you’re seeing right here once we actually have it deployed internally and private we can send a bunch of fake data which is what’s happening this is pretending to be real web events that will then be sent back to our API as you see here so this is the raw data so from this raw data we will be able to aggregate this data and put it into a bulk form that we can then analyze now of course this is just about doing the API part we’re not going to actually visualize any of this just yet but we will see that if I do a call like duration 2 hours I can see all of the aggregated data for those two hours now this becomes a lot more obvious if we were to do something like a entire month and we can actually see month over month data that’s changing but in our case we don’t have that much data it’s not really about how much data we have it’s much more about how to actually get to the point where we can aggregate the data based off of time based off of Time series that is exactly what time scale does really well and enhances Post cres in that way so if we actually take a look at the code the code itself of course is open source feel free to use it right now you can actually come into the SRC here into the API into events into models you can see the data model that we’re using if you’ve used SQL model before this will actually give you a sense as to what’s going on it’s just slightly different because it’s optimized for time series and time scale which is what time scale model is doing now if you’ve used D Jango before this is a lot like a d Jango model it’s just slightly different by using SQL model and something called pantic don’t don’t worry I go through all of that and you can skip around if you already know these things but the point here is that is the model we are going to be building to ingest a lot of data after we have that model we’re going to be able to aggregate data based off of all of that which is what’s happening here now this is definitely a little bit more advanced of an aggregation than you might do out of the gates but once you actually learn how to do it or once you have the data to do it you will be able to have much more complex aggregations that will allow you to do really complex you know averaging or counting or grouping of your data all of it is going to be done in time series which is I think pretty cool now when we actually get this going you will have a fully production ready API endpoint that you can then start ingesting data from your web services I think it’s pretty exciting let’s go ahead and take a look at all of the different tools we will use to get to this point let’s talk about some of the tools we’re going to use to build out our analytics API first and foremost we’re going to 
be using the fast API web framework which is written in Python now if you’ve never seen fast API before it’s a really great framework to incrementally add things that you need when you need them so it’s really minimal is the point so there’s not a whole lot of batteries included which gives us all kinds of flexibility and of course makes it fast so if you take a look at the example here you can see how quick we can spin up our own API that’s this right here this right here is not a very powerful API yet but this is a fully functioning one and of course if you know python you can look at these things and say hey those are just a few functions and of course I could do a lot more with that that’s kind of the key here that we’re going to be building on top of now if you have built out websites before or you’ve worked with API Services before you can see how easy it is to also do the URL routing of course we’re going to go into all of this stuff in the very close future now of course we’re going to be using Python and specifically Python 3 the latest version is 3.13 you can use that 3.13 3.14 you could probably even use 3.10 Maybe even 3.8 I’m going to most likely be using 3.12 or 13 but the idea here is we want to use Python because that’s a dependency of fast API no surprises there probably now we’re also going to be using Docker now I use Docker on almost all of my projects for a couple of reasons one Docker locally emulates a deployed production environment it does it really really well and you can actually see this really soon but also we want to build our applications to be production ready as soon as we possibly can Docker really helps make that happen now another reason to use Docker is for database services so if you just want a database to run on a per project basis Docker makes this process really easy and so of course we want to use a production ready database and we also want to use one that works with fast API and then is geared towards analytics lo and behold we’re going to be using time scale time scale is a postgres database so it’s still postres SQL but it’s optimized for time series now yeah I partnered with them on this series but the point here is we’re going to be building out really rich robust time series data and fullon analytics right that’s the point here we want to actually be able to have our own analytics and control everything so initially we’re going to be using Docker or the dockerized open- source version of time scale so that we can get this going then as we start to go into production we’ll probably use more robust service is from time scale directly now we’re also going to be using the cursor code editor this of course is a lot like VSS code I am not going to be using the AI stuff in this one but I use cursor all of the time now it is my daily driver for all of my code so I am going to use that one in this as well now I also created a package for this series the time scale DB python package so that we can actually really easily use fast API and something called SQL model inside of our fast API application with the time series optimized time scale DB that’s kind of the point so that’s another thing that I just created for this series as well as anything in the future that of course is on my private GitHub as far as the coding for entrepreneurs GitHub all of the code for this series everything you’re going to be learning from is going to be on my GitHub as well which you can see both of those things Linked In the description so that’s some of the fundamentals 
Let's go ahead and actually start the process of setting up our environment.

In this section we're going to set up our local development environment, and we want it to match a production or deployed environment as closely as possible. To do this we're going to download and install Python 3, and we're going to create a virtual environment so our Python projects are isolated from one another; in other words, Python versions and packages don't conflict with each other on a per-project basis. Then we'll do our FastAPI hello world by installing Python packages, and after that we'll set up Docker Desktop and Docker Compose, which will let us spin up our own database, the Postgres database we talked about previously, using the open-source version of TimescaleDB. Once the database is ready, we'll look at a Dockerfile for just our FastAPI web application so that it can also be added to Docker Compose, again being really ready to go into production. Then we'll do the Docker-based FastAPI hello world.

You could absolutely skip all of this, but setting up an environment that you can repeat on other systems, whether for development or production, is a critical step toward building something real that you actually deploy and get value out of. So this section is optional for those of you who have done all of this before, but if you haven't, I'll take you through each step and cover the fundamentals so we're all on the same page before we move on to more advanced features.

Let's jump into downloading and installing Python 3. We're going to install Python, specifically Python 3.13, from python.org: go under Downloads and select the download that shows up for your platform. The platforms are also listed on the side, so you can always pick your platform and look for the exact version we're using, which in my case is Python 3.13; you can see there are other releases available to download as well. This is true on Windows too, but the idea is to grab the universal installer for whatever platform you're on; that's probably the best one for you. That isn't always true for older versions of Python, but for newer ones it's probably great. So go ahead and download that one, and if you're on Windows, go into the Windows downloads and pick the option that's best for your system. If you want a lot more detail, especially for Windows users, consider checking out my course on cross-platform Python setup, because I go into a lot more depth there.

The process of installing Python is really straightforward: you download it, which is true for Windows or Mac, open up the installer you downloaded, and go through the install process. One thing the installer calls out a number of times is to run Install Certificates; we'll do that as well, just to make sure all of the security pieces are in place.
I'm going to agree to the installation, run it, put my password in, and do all of the things you'd normally do when installing any software from the internet. I will say there's one other aspect of this that I'll do again later, in the sense that I'll install Python again using Docker, specifically from Docker Hub. There are a lot of different ways to install and use Python, but both of these ways are fairly straightforward.

Okay, it finished installing, as we can see, and it also opened up a Finder window for that version. If you're on Windows it may or may not open a folder like this, I haven't done it in a little while, but either way you want to make sure you run the Install Certificates command in there, which, if you look at it, is really just running a pip install command; we'll see pip installs in just a moment, but it's basically a pip install of the certificate package to make sure all of that security is in place.

Once it's installed, verify it by opening a new terminal window and running something like python3 -V (capital V). If you see a different version of Python here, there's a good chance you have another version already installed; for example, I also have Python 3.12 installed. You can use many different versions of Python itself, and all of these different versions are exactly why we use something called a virtual environment to isolate Python packages. Again, if you want more detail on this, go through that setup course, it covers all kinds of platforms. Otherwise, let's start creating a virtual environment on our local machine with what we've got.

Part of the reason I showed you the different versions of Python was to highlight that versions matter, and they can become a big issue if you don't isolate them correctly. Between Python 3.12 and 3.13 there aren't that many changes in terms of your own code, but what might change in a major way is the third-party packages that go in there. What if FastAPI decides to drop support for Python 3.12 and you can't use it anymore? That's the idea, and it's why we need a virtual environment, exactly what we talked about before.

Back in Cursor, open a new window, and inside that window open a new project. I want to store my projects in a particular place, so I find a spot on my local machine, call the folder analytics-api, and open it. Normally Cursor opens you up to the agent; I'm not going to use the agent right now, just standard code. First, save the workspace as analytics-api. Then open the terminal window, either from the dropdown or, if you learn the shortcut (which I recommend), by toggling it like I'm doing. The idea is to verify that we've got Python 3 in there; you can probably even see where that version of Python 3 is stored, which shouldn't look that different from what you saw when we installed the certificates.
The point, of course, is that we're going to use this Python to create our virtual environment. So I'm going to run python3.13 -m venv venv. That's the macOS command, and it should also work on Linux. If you're on Windows it's slightly different: you'd basically use the absolute path to where your Python executable is, something like that path, followed by -m venv venv. This is another place where, if you don't know this well, definitely check out the course I mentioned.

Now that we've got this virtual environment, all we need to do is activate it. The nice thing about modern code editors is that when you open them up they might activate the virtual environment for you; in my case it's not activated, so the way you do it is source venv/bin/activate. That's true on Linux as well. Windows is slightly different unless you're using WSL or PowerShell, where things vary a little, but more than likely it's something like .\venv\Scripts\activate, which will activate that virtual environment for you. Once it's activated, I can run python -V, notice the 3 is no longer needed, and there it is. And if I run which python, it shows me where it's located, which of course is my virtual environment.

This is where we can use python -m pip, or just pip; both are the Python package installer. We can run pip install pip --upgrade, and that only happens in my local virtual environment, it does not affect pip on my machine globally. You can check that by running pip in a non-activated terminal window: there it either isn't found or points somewhere else, whereas python3 -m pip (or whatever pip is on your system) shows where it's being used. Back in my activated virtual environment, scrolling up a little, I see something very similar, except the usage and the Python version are based off of the virtual environment; that's the absolute path to it. It will be a little different on your machine unless you have the same username and stored it in the same location, but overall we're now in a place to use this virtual environment.
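To recap, the virtual environment setup looks roughly like this on macOS/Linux (a sketch; Windows activation differs as noted in the comments):

```bash
# Create a virtual environment named "venv" with a specific Python version
python3.13 -m venv venv

# Activate it (macOS/Linux); on Windows it's roughly:  .\venv\Scripts\activate
source venv/bin/activate

# Confirm the interpreter now points into the venv
which python
python -V

# Upgrade pip inside the venv only
python -m pip install --upgrade pip
```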
So let's see how we can install some packages, and the best approach to doing that. Installing Python packages is really simple: it's pip install plus the package name. That's how the Python Package Index works; if you're familiar with something like npm, these are very similar tools. If you go to pypi.org you can search all of the published Python packages, and pip install can install them directly from there. A quick search for FastAPI, for example, shows the current version that's available and how to install it.

The key thing about these installations is that, like many things, there are a lot of different ways to do them. There are other tools out there; Poetry is one, with a command along the lines of poetry add, I believe. I stick with the built-in tools because they're the most reliable for the vast majority of us. Once you get more advanced you might change how you do virtual environments or how you install packages, but the official repository for all of these packages is PyPI, and it still says pip install, so it's still very much a big part of what we do.

The idea now is that we need to install some packages. I actually closed out my project so I could show you the correct way to do this. Reopening the recent project with that workspace, all I have is the virtual environment and the code workspace. If I toggle open the terminal, it may or may not activate the virtual environment. The wrong way to do this is to just start running pip install fastapi; in this case it says command not found, and the reason it's wrong is that I don't have the virtual environment activated. So activate the virtual environment manually if the terminal isn't doing it for you, and then do the installation: pip install fastapi and hit enter.

That's all well and good, except that I now have no reference inside my project to the FastAPI package itself. If I accidentally deleted the virtual environment, I'd have no way to recreate what I did. So we create something called requirements.txt. Once again, this is a file that could be done in different ways, and there are other official ways to do it, but one of the nice things is that inside this file I can write something like fastapi and then run pip install -r requirements.txt, which takes the references from this file and makes sure everything is installed. Once it's saved and installed, the output shows the versions, so we can grab the version and pin it by writing fastapi==<that version> and running the installation again. That's actually really nice because it locks that version in place. If you go into the release history you could go back in time and grab an older version, say 0.113.0; put that in, run the installation, and it will install that old release and everything relative to it into the environment. In my case I actually want the current version, and a lot of times I'd use another tool to make sure the pinned versions stay consistent; that tool is called pip-tools, which I'm not going to cover right now. The idea is simply that we want to keep track of our requirements, and FastAPI is one of them.

Alongside FastAPI there's something else called Uvicorn that we'll want as well; it often goes hand in hand with FastAPI, and once again it's just pip install uvicorn, or adding uvicorn to the requirements file.
There's something else related to Uvicorn that we might want, and that's Gunicorn. A quick search shows that Gunicorn is for when we go into production; Gunicorn and Uvicorn can work together. So I'll bring that in as well. Uvicorn and Gunicorn are great tools for running the application in production, but we usually don't have to lock down their versions at this stage. Once you actually go into production, yes, you'll probably want to pin versions; the more important one is probably FastAPI, but even that we might not need to pin right now. If you're learning something from what I'm telling you, you probably don't need to lock down versions yet; if you already know you need to, you'll just do it anyway. That's why I'm mentioning it.

Then of course there's my package, which is definitely a lot newer: a quick search shows timescaledb. There aren't that many versions of this package, so this is another one I'm not going to pin. Within that package there's a dependency called SQLModel; it also doesn't have that many releases, but it's pretty stable, it works well with FastAPI, and you'll notice it's powered by Pydantic and SQLAlchemy, which means we probably want those in our requirements as well, just so we have a reference to them. So I'll bring those in too: pydantic and sqlalchemy. Now that we've got all of these in the file, I can once again run pip install -r requirements.txt and install all of the packages as needed.

Here's the kicker, and it's why I'm showing you this: think of your virtual environment as being just for the local environment. You are not going to use it in the production environment; you'll create a completely different one there. What's funny is that my terminal now opens up and attempts to activate a virtual environment I don't actually have at the moment, so I'll bring it back: recreate the virtual environment (on Windows you might have to do something slightly different), reactivate it with source venv/bin/activate, and run pip install -r requirements.txt again.
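At this stage the requirements file ends up looking something like this (a sketch with the package names discussed above; whether and how you pin versions is up to you):

```text
# requirements.txt
fastapi
uvicorn
gunicorn
timescaledb
sqlmodel
pydantic
sqlalchemy
```

It gets installed, and re-installed any time the file changes, with pip install -r requirements.txt.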
Running that brings all of those requirements back for this entire project. This will come up again, we'll see it for sure, because it's really important.

So now we've got our requirements and our virtual environment; it's time to do our FastAPI hello world. Before we go much further, I'll say that any file changes to the code will be on the official GitHub repo for this entire series, in branches; the start branch is where we're leaving off for this part, and it doesn't have everything in it yet, but it will. You'll also see a LICENSE, README, and .gitignore in there; I'm not going to show you how to make those, they'll just show up in a moment.

Now let's create our first FastAPI application. It's very straightforward: we create a Python module with some FastAPI code and run it. The key part is just to verify that our package runs, that it can actually go. Back in the project, open the terminal window; if the virtual environment activates, great, if not, activate it manually. Every once in a while when you create a virtual environment you'll see it activate automatically; we hope it starts doing that, and I'll mention when it might. So: source venv/bin/activate, or whatever it is on your machine, and then pip install -r requirements.txt. It never hurts to run that over and over; if everything is already installed, it won't do anything other than tell you so. We just want to make sure it's installed so we can actually run it.

The next question is how we run it. If we go into the FastAPI documentation and look at the example, it says to create a file main.py with the example code. So where do we put it? You could do it right here: create main.py in the project, copy the example code, paste it in, and save with Command+S, which is exactly what I do constantly; I just saved it about ten times.

Now we want to run this code. Back in the docs, scrolling down a little, it says to run it with fastapi dev main.py. I hit enter and get an error. There are a couple of things to fix before this goes any further, and this might make you think we didn't install FastAPI correctly. In a way we didn't: we didn't install the extra that provides that particular command-line tool, so the fastapi command isn't necessarily there by default, and in this case it's simply not there. I could consider using it as my development server for FastAPI, but what I actually want to use is Uvicorn, because it's closer to what I would use in production.
It's also basically what FastAPI is using anyway: if you look in the documentation and scroll down a bit, you'll see the "will watch for changes" and "Uvicorn running" output, so under the hood it's Uvicorn. That's what we want: to have Uvicorn run our actual main app. Now, if you're on Windows you might end up using Waitress at this point; it plays a similar role to Uvicorn, and we'll talk about smoothing that over in a little bit when we get into Docker. For now I'm going to use Uvicorn, and if you're on Windows you can also try the fastapi standard install, which will probably work for you as well.

So the way we run this with Uvicorn is: take the module name from main.py, then a colon, then the FastAPI app declaration inside main.py, which is simply app, and add --reload. I hit enter, oh, that trailing slash shouldn't be there, so I try again, and now it says it's watching for changes and running on a particular port, just like what the docs show. It just doesn't print the "serving at" and "docs at" lines, and there isn't a production version of this command yet; we'll see that later.

Now I can open it with Command+click, or copy the URL, or Ctrl+click and open it that way, and there it is: hello world, it's running. Congratulations, maybe this is the first time you've ever run a FastAPI application, and it's up and working on your local machine. Don't get too excited, though, because you can't exactly send this to anybody yet.

If I want to close out this server, there are a few options. One is Ctrl+C, which literally shuts it down. Another is to kill the terminal, open it again, reactivate the virtual environment, and press up a few times to find and re-run the command, and there it is. So now we have it running; what next?
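The run command itself, as a quick sketch:

```bash
# "main:app" is <module>:<FastAPI instance>; --reload restarts on code changes
# Serves on http://127.0.0.1:8000 by default
uvicorn main:app --reload
```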
Well, we need to put main.py in a location that makes more sense than where it is right now. It's sitting in the root folder, which is basically something nobody does once you start building professionally; you usually put it inside something like src/main.py. Notice the slash in src/main.py; that's important, because as soon as I hit enter it creates the folder for me along with the new module, which is where I paste all of that code again, and then I delete the old main.py with Command+Backspace (there are other ways to delete it, of course). So now we can verify that FastAPI is running, it's working, it's ready to go.

If you have Docker installed, the next part, really the next half of this section, is going to let us build on top of this so we can use this code a little more reliably than we currently do. Let's say you want to share this application with one of your friends and help them set it up and run it. What you'd probably tell them is: hey, download Python from python.org, make sure you create a virtual environment, install these requirements, and then use Uvicorn to run main.py. Then you'd remember: oh wait, you also have to activate the virtual environment before installing the requirements and running Uvicorn. Then you'd remember: oh, make sure you download Python 3.13, because that's the version we're using. There's a lot of nuance in just the setup process, which you and your friend could eventually figure out by talking, but it would be better if you could just hand them something and it just worked.
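Spelled out, those instructions amount to something like this (a sketch; it assumes Python 3.13 is already installed from python.org and that main.py now lives in src/):

```bash
# Manual setup a collaborator would have to repeat by hand
python3.13 -m venv venv
source venv/bin/activate          # .\venv\Scripts\activate on Windows
pip install -r requirements.txt
cd src
uvicorn main:app --reload
```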
Docker containers are that something. What we end up doing is packaging this application into something called a container image, and we do it in a way very similar to what we've been doing: download and install the runtime you want (Python 3), create a virtual environment, activate it, install the requirements, and run your FastAPI application. We'll get to the point where we build those images very soon, but right now I just want to show you how to run and use containers by downloading Docker Desktop.

Go to docker.com and download Docker Desktop for your platform. You'll get the Docker Desktop user interface, but more importantly you'll also get Docker Engine, which actually runs Docker containers and lets you create and share them; those are the key parts of using Docker. Docker itself can be really complex, so I want to keep things as straightforward as possible; it's very similar to what we did with Python when we downloaded and installed it, just done the Docker way. Once you've downloaded it, open it up on your machine and you'll see a UI something like this. If it isn't popping up and it's just sitting in your menu bar, you can go to the dashboard and that will open it back up; mine is just off the recording screen, which is why you're seeing it this way. Either way, we now have Docker Desktop on our local machine.

I also want to verify that this is working correctly, because if I don't, it's going to be hard to run anything. Opening up the terminal, I should be able to run docker ps, which really just shows me all of the containers running on my machine; we'll come back to it in a moment. If this errors, you don't have it set up correctly; the error looks like the command-not-found message you'd get from typing gibberish. I also want to make sure Docker Compose is there. We'll come back to Docker Compose really soon, but for now those are the two commands we care about.
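The two quick checks look like this:

```bash
# Lists running containers; an empty table still means Docker Engine is up
docker ps

# Confirms the Compose plugin is available
docker compose version
```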
I'm going to give you a very minimal version of using and learning Docker over the next few parts of this section. The reason is that we need it for a very stable local development environment, but also so we can push this into production. So let's take a look at how to actually use a Docker container image. On Docker Desktop there's a Docker Hub tab, and there's also Docker Hub on the web at hub.docker.com; both of these are simply Docker Hub, and you can search for container images from either one. Not surprisingly, the Docker Hub view in Docker Desktop and the website look basically the same.

In our case, let's do a quick search for python; we want the Python runtime. When we went to python.org we went to Downloads, picked a platform, and could grab a specific version of Python. The same is true on Docker Hub, but instead of a download it's a tag: go into Tags and you can find the different versions of Python itself. You can get the latest version, which is often a good idea, but just like we talked about with packages and Python itself, a lot of times you want a very specific version to make sure everything works together. So searching for Python 3.13, we see 3.13 (the one we've been using) along with 3.13.1, 3.13.2, and so on, and if we go back to the actual Python releases we can see the same release right there, the exact same one: one is through Docker, one is directly on your local machine. The one through Docker works cross-platform on Linux, Windows, and Mac, which is really great.

But I actually don't want to use the version I already have on my machine; I want to use a really old version, like Python 3.6. It's still listed, and it has vulnerabilities; there are a lot of security concerns with using something that old in a production system, but for this example there are no concerns at all, and we can delete it afterwards, which is really simple to do. The way we actually get it is by copying one of the pull commands: scrolling down, I'll grab the one for 3.6.15, copy that command, go into my terminal, and paste it in. That downloads Python 3.6.15 directly from Docker Hub in image (container) form, and we can then run it with docker run python:3.6.15.
I hit enter, and it immediately stops. The reason it immediately stops is that Docker has a lot of features, and one of them is an interactive terminal: add -it to the same docker run command with that same container image, hit enter, and now it actually opens Python for me; I'm quite literally in the Python shell, much like opening a new terminal window and typing python3, which is also a Python shell. The difference is that I can clear this out, run it again at any time, and change the version. There's another difference too: it says Linux up here and Darwin down there, because on my local machine I'm on a Mac, but inside the Docker container I'm on Linux. Docker containers are basically Linux with a bunch of things already set up.

The other cool thing is that I can exit out of here and run python 3.7 instead; Docker says it can't find it locally, so it goes ahead and downloads it for me. All the while, on my local machine, python3.7 isn't available, python3.6 isn't available. It's very similar to using a virtual environment, but it's another layer of isolation, and it gives us packaged runtimes we can use at any time, which is fantastic.

Inside Docker Desktop we can delete these old images that we don't want anymore, and you'll want to get in that habit, because old Python images are rather large. That's also why there are other tags, like the slim variants; look how much smaller the slim-buster image is. You can grab that tag instead and do the same docker run -it with it; it downloads, it's much smaller, which means it probably doesn't have nearly as many features out of the gate, but it's still Linux and we can do all sorts of Python things in it. Back in Docker Desktop you can see everything that's been downloaded, and the slim image is quite a bit smaller than the others.

Another nice thing about Docker Desktop: you can search for python and quickly see everything using the Python image. Under Containers are the running (or stopped) instances, which you can stop and delete, and under Images you can delete the images themselves, which in my case gives back about 2 GB of space; you definitely want to get in the habit of doing that. When you want to run one again in the future, Docker will just re-download the image. It's really meant to be ephemeral like this; think of Docker as something temporary that you add, remove, run when you need, and delete when you don't. Every once in a while you'll see that you can't delete an image because a container is still running; in that case go into Containers, delete the running one, then go back to Images and delete it. Worst case, you literally quit Docker Desktop, open it back up, and check what's running in the Containers tab.
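The commands from this walkthrough, roughly (the old tags shown are the ones used in the video; any tag from the python repository on Docker Hub works the same way):

```bash
# Pull an old Python image just to see how tags work (not for production use)
docker pull python:3.6.15

# Without extra flags the container starts and then exits immediately
docker run python:3.6.15

# -it attaches an interactive terminal, dropping you into the Python shell (on Linux)
docker run -it python:3.6.15

# Smaller variants are just different tags, e.g. a slim image
docker run -it python:3.6.15-slim-buster
```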
You can also use docker ps to see if anything's running; right now nothing is, which is very clear in the desktop app as well as my terminal. We'll keep using this. I don't expect you to know all of Docker at this point, and you shouldn't; you should just be able to benefit from it. It allows for all sorts of rapid iteration, it isolates our projects from each other on a whole other level beyond what virtual environments can do, and it lets us have as many database instances as we want, which is exactly what we're going to do very soon with TimescaleDB. So that's Docker Desktop and some of the basics of working with Docker containers.

Now I want to get FastAPI ready: we're going to build our own container for FastAPI, or at least the makings of it, and then look at how to develop with Docker itself. We bundle our FastAPI application as a container by writing instructions in a specific syntax so that Docker can build the container image from those instructions and our code. That file is called a Dockerfile, and we're writing a production Dockerfile for FastAPI. The reason we're doing it now is that the sooner our project is optimized for production, the sooner we can actually go into production and share this with the world, or with anyone who has a Docker runtime.

The actual Dockerfile we'll use is on my blog; it's a production version you can grab and run, and I'm going to make a modification to it for this project. So I'll open my project and copy the steps as comments (you don't have to), just to lay out where I want to go with this Dockerfile: first, download Python 3; create a virtual environment; install my packages; and run the FastAPI application. Those four steps are really what I want to happen. The way this works is very similar to when we ran Python with Docker directly: we saw that it's Linux, and it all starts from a base image and its tag. We find that the same way we did before, by going to Docker Hub and looking for the container image, the version, and the tag we want, which in our case is 3.13.
Specifically 3.13.2, the same release we have locally, the one we downloaded in the first place; we're going to use that same version for our Docker container. The big difference is that we don't want to use the default image, because it's massive, massive across the board. We want a small one, called slim-bullseye, very much like the slim-buster tag we used when pulling the old Python image.

The way it works is the FROM instruction, that's the Docker syntax, followed by the image we want. We could use latest, but that might be 3.13 today and 3.14 or 3.15 later; we want to be very specific, so it's 3.13.2, and then slim-bullseye on the end, which gives us a much smaller image. As soon as we use a much smaller image, we lose some of the defaults that come with the full Linux image, which means we also need to set up our own Linux environment. So the next step is setting up the Linux OS packages; if you were deploying directly on a Linux virtual machine you'd need to do the same thing, and it's the same concept here.

I could walk through every one of these steps with you, or we can just jump into the blog post and copy it, because this is not a course about Docker. So I'll copy what's in the blog post, bring it in right underneath those comments, and modify it a little to make it work for our project. First, notice there's an ARG in there that we don't actually need; we'll stick with the single version we chose. Then let's go line by line. First we create the virtual environment; no big deal, except this time it's in a specific location. It's as if we were setting up a remote server: we'd want the virtual environment in one known location, because that server is probably only going to have one virtual environment for this particular application, and that's exactly what our container is. This one container will only have one Python project, but we still use a virtual environment to isolate our Python from whatever Python the operating system itself has. Next, a little PATH command makes it so we don't have to activate the venv each time; it's effectively always activated, so we can run the pip install command and upgrade pip. Then there's the Python-related setup, followed by the OS dependencies for our mini virtual machine, our mini server: things you'd install for Postgres, or extra libraries if you were using something like NumPy, or git if you wanted it in there. There's a lot you can do at the OS level, and that's one of the cool things about containers: you also control the operating system, not just the code. After that we make a directory in this mini operating system called code and set it as the working directory; in other words, we're just going to be working inside of there. Then we copy our requirements file, requirements.txt, into an absolute location.
The exact location doesn't matter much, we wouldn't have to use an absolute path, but it's nice to have it in one known place because later, when we need to install from it, that's what we reference. Then the Dockerfile copies our code into the container's working directory; in other words, main.py ends up inside the code folder, no big deal. There are a few other things in the blog post version that we probably don't need. The final piece is a runtime script, a bash script that actually runs the application, after removing some of the old files to reduce the image size.

Before calling this done, I want to create something called boot/docker-run.sh. The reason is that all of us need to know what's actually happening when the application starts, and this script is it. First we mark it as a bash script with a #!/bin/bash line so the Linux container can run it; then we cd into the code folder, activate the virtual environment, set the runtime variables, and run the application, pointing at the app in main. That's slightly different from what the blog post has; I'm going to keep it as the main app for now, we might change it in the future, but this is the script that runs our application. To use it, in the Dockerfile we copy boot/docker-run.sh into /opt/run.sh (so instead of the inline script, we've got this new one with that name), run chmod to make it executable, and then that script, with its .sh extension, is what the container ends up running.

I still need to test this; it isn't necessarily working yet, and I'll have to make some changes to be sure, but that's not something to worry about right now. The main thing is that we have a Dockerfile and we'll be able to build it really soon. This Dockerfile most likely won't change much; if anything changes, the blog post or the actual code in the repo will have the updates. That's the key thing about Dockerfiles: they don't need to change much. The only things likely to change over time are the Python version you use and the code that goes in; everything else will probably stay pretty static. That's also why, in the blog post, the run step is written inline, the script is created right in line; having it as an external file just makes it go a little more smoothly than what's written there.

I realize some of you aren't fully ready to learn the ins and outs of Docker, but we want to be as close to production as possible, which is why that blog post exists and why you can just copy this. Personally, I think looking at the Dockerfile makes it very clear what steps need to happen to recreate our environment many, many times over, so it's a lot easier to share, whether with somebody else who has a Docker runtime or with a production system.
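Put together, the Dockerfile and run script described above look roughly like this. This is a sketch based on the structure described; the exact OS packages, paths, and server flags are assumptions you'd check against the blog post or the series repo:

```dockerfile
# Dockerfile (sketch)
FROM python:3.13.2-slim-bullseye

# Create a venv in a fixed location and keep it "activated" for every later step
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --upgrade pip

# OS-level dependencies for the mini server (e.g. Postgres client libs)
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq-dev gcc \
    && rm -rf /var/lib/apt/lists/*

# Project code lives in /code inside the container
RUN mkdir -p /code
WORKDIR /code

COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

COPY ./src /code

# Runtime script that actually starts the app
COPY ./boot/docker-run.sh /opt/run.sh
RUN chmod +x /opt/run.sh

CMD ["/opt/run.sh"]
```

```bash
#!/bin/bash
# boot/docker-run.sh (sketch): activate the venv, cd to the code, start the server
source /opt/venv/bin/activate
cd /code

RUN_PORT=${PORT:-8000}
RUN_HOST=${HOST:-0.0.0.0}

# Gunicorn managing Uvicorn workers; "main:app" because main.py sits in /code
gunicorn -k uvicorn.workers.UvicornWorker -b "$RUN_HOST:$RUN_PORT" main:app
```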
Over the next few parts we're going to implement the Docker Compose based FastAPI hello world, but before we get there we need to see a few more things about Docker itself, just so you have a foundation for what's happening in Docker Compose. Jumping back into the project, we've got this Dockerfile: the instructions that set up the tiny little virtual machine, the tiny little server, that runs our application, bundling everything up with our code. Now we need to actually build it. There are really two commands for this, and I'll put them in the README under a Docker section: docker build and docker run.

Build is what it sounds like: it builds our bundled container image. The way we do it is by tagging it, and hey, tags, what does that remind you of? Hopefully the image tags we saw on Docker Hub and in the terminal. We need to tag it the way it would be tagged on our account if we were pushing it to Docker Hub, which we're not, but we still need a tag. So we build it and tag it, then say where we're building from, which is the local folder, just a period, and we also specify the Dockerfile to use. If you don't specify one, it just uses the file literally named Dockerfile. The reason the option exists is that you can have multiple Dockerfiles, like Dockerfile.web, in which case you'd specify that file explicitly. We aren't doing that, we're using the one single Dockerfile, but it's important to know about if you go in that direction.

Once you build it, you run it: build, then run. We've already seen the run command a bit with the Python image. The thing about docker run is that there are a bunch of arguments that can go with it, like the -it we used with Python; we aren't going to spend any time on those arguments here, because we're going to use Docker Compose, which will handle all of this for us, as we'll come to see.

So let's build the container. In the root of the project, right next to the Dockerfile itself, I run the build command and it builds it out for me. In my case it went really fast because of the build cache; I actually tested this and built it already. If I go into Docker Desktop I can see the image that was built, my analytics app image; I've got a few of them from testing, and you can delete them just like we've seen before. There's one I can't delete yet, so I go back into Containers, clear the search bar, stop all of my containers and delete them, then go back to Images and delete that one as well, just so I can watch it being built out fresh, and so you know what to do if you want to get rid of an image, whether it's your app or someone else's. Running the build again, it still has the cache, so it goes really, really fast, which is super nice, though sometimes it will take a little longer than that.
There's also a legacy-format warning in the build output, because one of the Dockerfile lines needs an equals sign instead of a space, so I fix that and build again; cool, it builds really fast. So now I've got a container image that I can run. It's not public, it only exists on my local machine; it would only become public if I pushed it to Docker Hub, which, like I said, we're not going to do at this point.

To run it, I just do docker run and then whatever that tag is, in our case analytics-api, which we can also verify under Images. What's actually listed is the image name with latest next to it: if you don't specify a tag it defaults to latest, and if you did specify one you'd append it after a colon. We'll do some of this with Docker Compose in a little bit; for now I just want to verify I can run the application with docker run, and there it is, the application is running. If I try to open it in the browser, though, it won't work. We'll make it work in a little bit; the reason it doesn't is how Docker itself works: everything needs to be explicit. For the app to be reachable on a specific port, we also have to tell Docker about that port, not just the application, and we have a lot of control over how all of that works. What I want to see now is how to do these two things inside Docker Compose.

Off camera I stopped that container and deleted the built image, so running docker run analytics-api now fails: the image isn't local and it isn't on Docker Hub, so Docker simply can't run it. That's a little bit of an issue, and it's exactly the kind of thing Docker Compose overcomes.
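For the record, the raw Docker workflow we just walked through is roughly:

```bash
# Build the image from the local folder, tag it, and point at the Dockerfile
docker build -f Dockerfile -t analytics-api .

# Run it; with no tag specified, Docker assumes analytics-api:latest
docker run analytics-api
```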
So let's create a compose.yaml file and start specifying the various services we might need. The very first key in this YAML is services, and inside it go all of the container images we might want to run, like a database or our app. For now we'll just work off of our app. The nice thing about modern tooling is that a lot of times you can run individual services right from inside the YAML file (that might depend on an extension I have installed), but the point is we want these nested key-value pairs. app is just what I'm calling the service; I could call it src, web-app, or a whole lot of things, and what we call it will matter more later when we use something other than plain app.

Inside the app service we specify the image name. We could call it the same thing, analytics-api, and this time tag it something like v1. Next we define the build parameters: the context, which is the local folder (that dot should remind you of the dot in the docker build command), and then the Dockerfile to use, relative to the compose file, so dockerfile: Dockerfile. If we had named it Dockerfile.web, as you might at some point, you'd change that value to match, and one other note: I'd change the build command in the README as well, in case I wanted to build it individually.

So now I've got my Docker image name and some build parameters. You can add additional build parameters, but context and dockerfile are all we're going to set, partly because it matches exactly what we did to build it manually, and partly because we don't need a whole lot of extra context just to build; that's what the Dockerfile is for, it carries most of that. Running it is another story. Actually, if I leave it like this and try to run it, we'll see something interesting: I clear the terminal and run docker compose up. It starts to run, but first it builds the application image, and then it runs it, and of course if I go to the URL it still won't work; we'll get to that in a second. But it built and ran it basically the same way as before, except now the image is specified in the file, and the command I need to remember is simply docker compose up. If I want to take it down, I can open another terminal window and run docker compose down, which stops that container, which I think is pretty nice. And in Docker Desktop, under Images, you can see the analytics-api image with the v1 tag, so the image was built with that tag, which makes it a little easier to see what's going on.

Now what we actually want is to be able to access this endpoint, this URL. The way it works with Docker, whether it's Docker Compose or the plain docker run command, is that we need to specify ports explicitly. Part of the reason I put a PORT environment variable in the run script in the first place is so we can specify a different port; we'll play with that in just a moment, right alongside the ports setting. But before we do the ports, let's change the environment variable first.
We're not going to use entrypoint for this; instead we use the environment key and set key-value pairs. So let's set the PORT value to 8002. I'm not going to add ports just yet, just change the port, and then run docker compose up. Notice that the environment variables have changed, and if I try to open the app it's still not accessible. That's partly because what I changed was a runtime argument: this one little change had a pretty big effect on the application because of that environment variable, specifically the PORT value in compose.yaml. The reason I'm mentioning this is that at some point we'll have a DATABASE_URL in here, and we'll pass that in the same way.

Another thing we can do with Docker Compose is use an env_file. You add env_file with something like .env, and in that file you could set PORT=8001. Right now I have conflicting environment variable values, so let's see what that looks like: I run docker compose down, wait a few seconds for it to finish, and bring it back up, and it's still port 8002. In other words, the hardcoded environment values override the environment variable file. If I removed the hardcoded PORT, it would fall back to whatever the env file says. For now I'll keep them the same so there's no confusion, but the point is that we can use environment variable files, and we can add more of them with additional lines, something like .env.sample or any other env file, as long as it's in the KEY=value format.

The final step is really just to expose the port. I bring the stack down, and the way this works is that we declare ports in the compose file. The editor hint here is actually helpful: it shows a host port mapped to a container port, for example host port 8080 going to container port 80. The container port is the port the app listens on inside the container, in our case 8002, and the host port is the port on our system that we use to access it. This mapping sometimes doesn't work as intended, so let's try it out: with docker compose up, the app is on 8002 and I exposed a different host port, so in my browser I try localhost:8080 and see if I can connect. Depending on how your system is set up, this might work and it might not, so what you typically want to do is map your host port to the same port the container uses. The general rule of thumb is: if you have the ability to use the same port on both sides, you should. The reason I showed you the mismatched ports is that this syntax is kind of confusing; just remember the first port is your system's port and the second is the port the Docker container app is running on. So let's bring that back down and bring it up again with matching ports.
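Pulling those pieces together, the compose file at this point looks something like this (a sketch with the names and ports used above; the final repo version may differ):

```yaml
# compose.yaml (sketch)
services:
  app:
    image: analytics-api:v1
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - PORT=8002          # inline values win over env_file on conflicts
    env_file:
      - .env               # e.g. PORT=8001
    ports:
      - "8002:8002"        # host port : container port
```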
We've got port 8002, we open it up, and now we've got the hello world. Congratulations, you have Docker Compose working. I realize this might feel a little complicated if you've never done it before, but really these are just a bunch of arguments we're telling Docker to use for this particular container image that's being built locally. You don't always build things locally, as we saw before with the Python image, but when we do, there's a whole other set of things we can do.

For example, we can override the Dockerfile command, the one that starts the application via the runtime script, with a command in the compose file: something like running uvicorn with main:app, the host, the port, and reload. The nice thing is that the service then starts from that command and doesn't use the Gunicorn script at all, so if you made any mistakes in that script, you could still run the app with this command, taking the stack down and bringing it back up as needed.

The next Docker Compose feature is that it can watch for changes to files and rebuild the entire container; we'll see that in just a moment. And if you want to do your own development against the container, there's one more thing: attaching a volume. The volume grabs our local src folder and mounts it where we want inside the container. The default suggestion says /app, which may or may not be the correct location; we want to mount src where the code actually lives, so we go back to the Dockerfile, scroll down to where we copy the src folder, and see that it goes into /code. So we make a minor modification to the volume target, which means our code is effectively synced into the container continuously, and the container can be rebuilt.

This might be confusing, so let's see what I mean. Run docker compose up --watch: you can see that watch is enabled. If I change requirements.txt, say I bring in requests (the Python requests package) and save, it rebuilds the container image right away. It can take a few moments, because that's exactly what happens when building containers, but there it is: it rebuilt and then restarted the application altogether. Next is changing the code itself. Before that, check the browser: hello world at port 8002. Now if I change the message in main.py to something like "hello Earth" and refresh, it changes automatically, which is super nice. So really, the command we want to use from now on is docker compose up --watch, so we can just keep changing our code; we now have a Docker-based development environment.

All you'll need to run going forward is docker compose up with watch, and if you do need to get into a command line inside this Docker image, you can do that with docker compose run followed by the service name from compose.yaml, which in our case is simply app. So back in the README, we can note docker compose run app /bin/bash, which drops you directly into a command-line shell.
Instead of /bin/bash, you could also just run python — docker compose run app python — and that should work as well, giving you the virtual environment's version of Python inside the container. Let's try either one of those out. Every once in a while you'll also see --remove-orphans; you might want to add that flag to a command when you run it, but we'll leave things as they are here. In the Python version, if I import os and print out os.getcwd() — I think that's the call — we see that we're in /code. If I exit, it exits the container altogether. What I typically do is not run Python directly but go into the command-line shell, kind of like SSH-ing into the container, and work from there. In there I can list things out and see main.py, which is quite literally this code right here; we can confirm that with cat main.py, which prints the code. If I change main.py locally to something different — say back to "Hello World" — clear the terminal, and run cat main.py again, it changes and shows that updated code as well. So now we have a mostly complete development environment we can use through Docker. There are other things we might want to expand on, but if you've ever thought, "I want an even more isolated environment inside of my project," this would be the way to go. The main point, though, is to be prepared for production — and I think we are well prepared for production now. In the next section, we'll start using another service in Docker Compose to begin building out the data schema we want to use. The key takeaway from this section is that we now have a production-ready development environment, thanks to Docker containers and this Dockerfile. The compose file helps us with the development environment, the Dockerfile helps us with the production environment, and together they give us that production-ready development environment. This setup is fairly general — it isn't really specific to our Python application — which is why we also made sure the Python application itself was ready for a local development environment first, and why we started there. Some of you may or may not use the Docker Compose setup during development, and that's totally okay; the key thing is that it's there, it's ready, and we can test it when we're ready. In other words, if you'd rather just activate your virtual environment and run uvicorn on your local machine, that's more than acceptable. As for Docker Compose, we can also just have it running when we need it, because the compose file is set up to react to changes you make to your code, or at least sync those changes — like when we changed requirements.txt and the entire container was rebuilt and restarted thanks to Docker Compose watch. So this really gives us a foundation we can build off of.
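To recap, the day-to-day workflow now boils down to a handful of commands — a sketch, with app being the service name used in this walkthrough:

```bash
docker compose up --watch            # run with file watching / code syncing enabled
docker compose run app /bin/bash     # open a shell "inside" the container
docker compose run app python        # or drop straight into the container's Python
docker compose down --remove-orphans # tear it down, cleaning up stray containers
```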
Now, does this mean we're never going to change anything related to Docker or Docker Compose again? No — as our application develops we might need to change how it runs, whether that's in Docker, in Docker Compose, or locally through Python and virtual environments. But the key thing is that this foundation can be reused on other Python applications if you want, and with slight modifications to the Dockerfile you can use it for Node.js applications too; they follow the same sort of pattern. That's the cool thing about the Dockerfile: it's a very step-by-step set of instructions, and you could even use it as a guide to deploy things manually, so if you don't want to use containers in production, the Dockerfile will at least help you with that as well. So we're in a really good place to start building on top of our application and flesh out all of the features we want. We will still use Docker Compose and the Dockerfile for other things, but the point is that the foundation is ready, and that was the goal of this section. Now let's look at how we can start building new features to really make this the true analytics API. In this section we're going to build out API routing and data validation. We're creating a REST API, which has very static endpoints — predetermined URL paths you can use throughout your entire project — and FastAPI is really good at building these, which is the reason we're using it. You're going to need the code running: I'll be using my Docker Compose version with watch enabled, but if you don't have Docker Compose up and going, you can use plain Python. In that case you'd clone the repo, create a virtual environment, navigate into src, and run the same command we gave Docker Compose — the uvicorn one — maybe changing the port, or using the same one. In my case I have both running and both are reachable, so either works; the point is being able to develop on top of what we've already established so we can build out the API service. There's one other thing I'll be doing: using Jupyter notebooks. I'll create a folder called nbs and add hello-world.ipynb. In it I'll write a cell that prints "hello world" and run it with Shift+Enter, which prompts me to select my Python environment — my local virtual environment right here. That means I'm not using Docker for this, and the reason has everything to do with how we're going to test things out. It asks me to install the ipykernel package, which we obviously want to do (you could also add it to requirements.txt), and Cursor, VS Code, and Windsurf are all really good at running Jupyter notebooks inline, which is why we're using them here. That's pretty much it for the intro. Using these Jupyter notebooks we'll test out all of the API routes we put in, to
make sure they're working as intended. Let's jump in. The purpose of REST APIs — really API services in general — is so that software can communicate with software. You're probably already well aware of this, but since we're having software communicate with each other automatically, we want to implement a health check almost first and foremost, so that if some piece of software has a problem reaching the API, it can hit the health check to see whether the API is down or not. That's where we'll start, and it's a nice low-hanging-fruit way of seeing how we build all of this out. Inside main.py I'm going to copy my read_root function, paste it underneath everything, and change the path to simply /healthz — health check API endpoints usually look just like that, not /health but with a z on the end; I'm not really sure why. Then we rename the function to something like read_api_health and return a status of "ok". In other words, we can tell the developers who are going to use this API: if you need to, just hit this health check. It's also going to be really important for us when we go into production. Now we want to actually test this health check. Inside nbs I'll keep the hello world notebook going and import a package called requests — python requests is a really nice way to do API calls, and it might be what you end up using in the future; there's another one called httpx, and both are really good with a very similar way of doing HTTP requests. What I see, though, is that I don't actually have the requests module installed. One of the nice things about these interactive environments is that I can run pip install requests right in the notebook, and it will install into the environment I selected at the beginning — that local virtual environment. Every once in a while you might need to restart the kernel; once it restarts, you rerun the cells, and at that point python requests is installed in my local virtual environment. I'll leave the install cell out now — feel free to comment that kind of thing out — and notice that I'm hitting Shift+Enter a lot to run each cell. Okay, so how do we actually call the health check? The way I think about it is: the endpoint I want is /healthz, and then I have an API base URL, which I often write as base_url, set to http://localhost plus the port we're using — in my case port 8002. If you're not aware, localhost is the same thing as 127.0.0.1; the two are interchangeable in most cases.
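Put together, the health check route looks roughly like this — a sketch, with the function name and port taken from the walkthrough:

```python
# main.py (sketch) -- the health check route roughly as described
from fastapi import FastAPI

app = FastAPI()

@app.get("/healthz")
def read_api_health():
    # other software (or a load balancer) can poll this to see whether the API is up
    return {"status": "ok"}
```

On the notebook side, something like requests.get("http://localhost:8002/healthz").ok should come back True once the container is up on that port.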
Okay, so "endpoint" maybe isn't actually what I want to call that first variable — I'll call it path, and then say the endpoint is the combination of the two, using some string substitution to put them together. Now we've got our endpoint, I can print it out, and that's pretty straightforward. If I Cmd+click or Ctrl+click it, I can open the web browser and see it right there. Of course, we want to build toward automation since we're using our API service, so we'll say response = requests.get(endpoint) and then print out response.ok. I run this and it gives me a True value — great. If I change the path to something like /abc and run those two cells again, I get False. Fantastic. I'll keep it as simply the health path, and this is what we'll keep doing: building on top of this with different paths, different data, and different HTTP methods. Now let's create a module for our actual API events. Inside src I'll create a folder called simply api, inside that another folder called events, and inside that a file called routing.py. I also want to make sure each of these folders has an __init__.py file, to turn api and events into proper Python packages so we can use dot notation, as we'll see in just a moment. The idea is to have very specific routes for our events resource: if main.py holds the kind-of-generic routes, what we really want is something like /api/events, with all of the events functionality living there — very similar to main.py, just scoped slightly differently. So in routing.py: from fastapi import APIRouter, then declare router = APIRouter(). This router is basically the same idea as the app, but it isn't the app — it's just for this one small portion of the larger application. Then I use router.get("/") — the GET HTTP method, which we'll talk about more shortly — and define a function; eventually it will have a more robust response, but let's call it read_events and return a dictionary of items, just 1, 2, and 3. The data coming back will definitely change over time, but for now it's simply read_events. In order for this to work, we need to bring it into main.py, because right now the FastAPI application knows nothing about this routing module — it isn't imported anywhere, and main.py holds the whole definition we have in place at this point.
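Here's roughly what that routing module looks like at this point — a sketch:

```python
# src/api/events/routing.py (sketch) -- the events router as described
from fastapi import APIRouter

router = APIRouter()

@router.get("/")
def read_events():
    # placeholder list-style response; the shape will change once a database exists
    return {"items": [1, 2, 3]}
```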
The way it works is we import it from the api package. Because events is a Python package (thanks to those __init__.py files), we can do from api.events import routing using dot notation, or go one more level and import the router itself, aliasing it as event_router. The reason I alias it as event_router is so I can effectively rename it — that way I can have a lot of different routers following the same sort of organization we had before. With that in place, we go to the app itself and call app.include_router, passing in that event_router. As you recall, the event router itself just uses a plain "/" path, but we really want it to live at /api/events, so that's the prefix we want for this route. I could set that path directly on the router and it would technically work, but what I actually want to do is go back into main.py and set the prefix there, without the trailing slash. Great — that's now an API endpoint route we can use. We still need to test it, but before going much further, there's one bit of this routing import I want to change: jump into the events __init__.py, do from .routing import router, and export it by adding router to the __all__ list. Notice it gets highlighted, and now back in main.py I can drop the routing import and just import the router directly — a subtle change, but it's a nice thing those __init__ modules can do. I could probably also import the router in the api package's own __init__.py, but I don't want to; I'd rather keep things somewhat isolated, or at least packaged this way. Now that we've got this router, we can jump to the endpoint and check it out by going to /api/events — and there we go, items 1, 2, 3. Since we have this API endpoint, let's create a notebook to verify it. I'll make something like 1-api-event-route.ipynb, and I'm going to keep numbering the notebooks so you can always reference them later. The idea is very similar to the hello world notebook: import requests, open a new code cell, and bring in the endpoint setup so we can run some tests. Running it with Shift+Enter asks which environment to use — our local virtual environment again — and then our path is going to be a little different: /api/events, and we can have a trailing slash. Then response = requests.get(...) with that endpoint; the .get method here correlates directly to the GET method on the route, which we'll talk about more in a bit. For now I run these two cells and, of course, I used path where I meant endpoint — a slight little mistake — so after fixing that we get back an okay response, which we can print with response.ok, and there's our True value.
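The wiring described above, sketched out — the prefix and alias follow the walkthrough, and main.py already defines the FastAPI app:

```python
# src/api/events/__init__.py (sketch) -- re-export the router so callers can skip ".routing"
from .routing import router

__all__ = ["router"]
```

```python
# main.py (sketch) -- mount the events router under the /api/events prefix
from api.events import router as event_router

app.include_router(event_router, prefix="/api/events")
```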
What we can do then is check the response: if response.ok, we can print out the data coming from that response by calling response.json(). We hit enter and there we go — items 1, 2, 3. The reason we use response.json() is that it gives us a dictionary, and if I check type() on that data — oops, that's a comma, not a dot — we see the class is dict. In other words, I should be able to do something like data.get("items") and all the usual dictionary-value things you might want to do. Of course, if we change our API endpoint, this might cause issues — items might come back as None. Let's actually change the endpoint real quick to return results instead; a slight change, made really fast. If I refresh the notebook output it's still the same data, because I didn't actually make the request again — I didn't hit the API endpoint. When I do hit it again, the data coming back is a little different. This is one of the main reasons it's really important to think through all of your API endpoints at the beginning: you want them to stay basically the same going forward, which is why we're testing them out first, to make sure we've got exactly what we want. You can always improve things later, but something as simple as returning results instead of items could have a drastic effect on somebody who wants to use your API — even if that somebody is just you. Let's keep going: we want to look at the impact of data types on our REST API. What we saw in the last part is that when I changed the key of this data from items to results, the notebook produced a different result altogether; I've copied that notebook and condensed it into one or two cells, and running it now we can still see that result. What I want to look at next is how this impacts a single API result. I'm going to copy the route and, instead of read_events, call it get_event — a single event. Think of it as a single row, with the list view being a bunch of items in a table (more on that soon). The way this works is we put something like {event_id} in the path: that's a variable passed in after the endpoint, and it comes into the function as a parameter with whatever name you gave it. If you declare it with a data type like int, FastAPI will look for a specific number — we'll see that in a second — and then we return back the id set to that event_id. Back in this new notebook, I'll pass in, say, 12, run it, and I get back exactly that data: still a dictionary, now returning an id of 12. If I change it to something like "a" and try to run it, I get nothing back — I can print response.ok and see it's not okay — and if I open up the URL itself, it says it was unable to parse the string as an integer, because the input was "a".
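The single-event route, roughly as described — a sketch continuing the routing.py module above:

```python
# src/api/events/routing.py (sketch) -- single-event lookup with a typed path parameter
@router.get("/{event_id}")
def get_event(event_id: int):
    # FastAPI rejects non-integer values (e.g. /api/events/a) with a validation error
    return {"id": event_id}
```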
So there are a couple of ways to solve this. Number one, we change the request back to an actual number, because it's an invalid request. Number two, if we did want to support "a", we'd change the parameter from int — a very specific data type — to str, which is more generic, and then the endpoint would support that string. In my case I want to keep it as an integer, because that's how we're going to approach this endpoint: it requires an integer across the board, which means our notebook will require that as well. But we're not quite done yet — that's the input, the data coming in. What about the data going back? What if I change the return value to use an abc_id key and run it that way? We'll actually still see data come back — let me make sure it's saved and run it again, and there it is — but the output data is now incorrect, and nothing stopped it. We need to change that with something different, called a schema, which is basically just designing the way the data should flow. In this case we'll use Pydantic. We create schemas.py, and from pydantic we import BaseModel, which is going to help us a lot. If you look back at requirements.txt, we did add pydantic there — Pydantic itself most likely comes in with FastAPI anyway — so we're really just using the basics of it. We declare our event class, inheriting from BaseModel, and this is where we design how we want the data to be returned: in this case, an id that is an integer. That's the schema that will be used — it's a lot like dataclasses — and it corresponds to the kind of dictionary we're returning; that's what the schema expects. The way this works is that back in our API routes we can import that schema: from schemas, import the event class. It's probably better than calling it just Event, so we'll call it EventSchema. You could call it a model, but we're not going to, and you'll see why when we start using the database: database models are what I typically call "models," and "schemas" are the designs for what goes into, or comes out of, a database model. Nevertheless, we now have EventSchema imported into routing. What we can do now is declare it as the response data type: if the colon plus int is how we declare the incoming data type, the arrow (->) plus a data type is how we declare the outgoing data type — the response type itself. As soon as I do this, I should see an error in my FastAPI application — or at least very soon — for that mismatched data type. So let's make a request, running it in two places: the notebook here and the FastAPI application below. As soon as I run it, I see the response is missing the required id field — the input is {"abc_id": 12} — so the output is incorrect. All we need to do then is change our route back to returning id, matching the schema from before, and run everything again.
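A sketch of the schema and the annotated route; FastAPI treats the return type annotation as the response model, which is what produces that validation error:

```python
# src/schemas.py (sketch) -- Pydantic model describing the outgoing event data
from pydantic import BaseModel

class EventSchema(BaseModel):
    id: int
```

```python
# src/api/events/routing.py (sketch) -- the return annotation doubles as the response model
from schemas import EventSchema

@router.get("/{event_id}")
def get_event(event_id: int) -> EventSchema:
    # returning {"abc_id": event_id} here would now fail response validation
    return {"id": event_id}
```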
This time it actually works — great. So now we've made our system a little more robust, a little more hardened, because it's harder to make a mistake now; or rather, that mistake becomes a glaring problem. If you put in the wrong key by accident, as soon as you go to try it out you'll get that error — "what is this field? don't know, got to fix that" — and you go back and fix it. Cool. Of course, catching errors also comes back to using something like git: tracking changes over time becomes important, and if you're building out a REST API you're going to want to use git. I'm not covering it here because it's outside the scope of this series, but it's worth noting right now, because git would help identify exactly this kind of problem — you could just run git diff and see what changed, whether in your routes or in your data handling. So let's look at that: I'll git add things (I believe I named this part something like "basic data types"), make that slight change again, and save it. Now if I do git status I can see there was a change, and I can look at the difference — oops, not on schemas.py but on routing.py — so git diff on routing shows exactly the problem coming through. That's another way to catch it, and hopefully catch it early on; you could also use automated testing with something like pytest, which would be even better for hardening the system. The point is we now have a way to validate a single piece of data — how about a list of data? Now that we can return reliable data for a single item, let's look at how to do it with a bunch of items. There are a couple of ways to think about this. One would be to annotate the return as a list: you can use the built-in list with square brackets around EventSchema. That is a way to return back a list, but it changes the response from a dictionary to a bare list — you can't exactly keep the dictionary shape that way; that's not really how it works. So what we'll do instead is bring in a whole new schema for the results. In schemas.py I'll add something like an event list schema, which has a results field whose type is a list of the event schema. We can also bring in the List class from typing — from typing import List — and use that instead of the built-in list element; both work, but this one is a more explicit type declaration. Now that we've got this, we bring it back into routing, use that EventListSchema as the return type, and we don't need to change the return value at all, because inside this schema we've got results. Great — so what does the actual result look like? Let's take a look.
I'm going to duplicate that basic notebook and call it something like list data types, change the path to simply /api/events, and run it — and we get False once again. If I open the endpoint up I get an internal server error, and what's likely happening is that the response data is invalid: each of those items has invalid data, because what's going into results isn't an event schema. We want each entry to at least look like that type — closer to {"id": ...} for each instance. The way we do that is to turn those items into dictionaries themselves: id equal to the id, repeated for each element, which I'll just do quickly (I probably could have copy-pasted). Now the results are dictionary values, and we can see if that solves the problem — it looks like the application rebooted, and when we run the notebook again, the results come back. This is actually super nice: it means our routes are now hardened to exactly the results we want. And of course, if we want to add something to this list schema — say a count, which I'll set to 3 — then we go back into the event list schema and add count as an integer (we could make it optional, but we'll leave it required). We run the notebook again and get False once more; let's check — the results aren't coming back correctly — back in the route we've got our count, and this really should work fine, so let's make sure everything is saved and try again. With it all saved, there we go: the data comes back, with that count in there. The point of this, of course, is to make sure that the way we design our API's data communication is hardened — it isn't going to change a whole lot — and it gets even better once we start entering this data into the database and actually grabbing it back out. But before we start grabbing data from the database, we need to learn how to send data beyond something like the URL route. In other words, we need to look at the POST, PUT, and PATCH methods to see how we can send data to our API. Let's take a look.
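For reference, here is roughly where the list route and its schemas stand now — a sketch with the placeholder values used above:

```python
# src/schemas.py (sketch) -- wrapper schema for the list endpoint
from typing import List
from pydantic import BaseModel

class EventSchema(BaseModel):
    id: int

class EventListSchema(BaseModel):
    results: List[EventSchema]
    count: int
```

```python
# src/api/events/routing.py (sketch)
@router.get("/")
def read_events() -> EventListSchema:
    # hardcoded placeholder data until a real database backs this route
    return {"results": [{"id": 1}, {"id": 2}, {"id": 3}], "count": 3}
```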
Up until this point we've been using the HTTP GET method, which becomes even more obvious when you look at requests.get in the notebook, app.get in main.py with a URL path, and router.get in routing.py — so GET is very, very common. We've got the list GET and the detail GET for a specific id; those are the methods you use to grab data from your API or, eventually, the database, and our notebook is really just a small example of that — it's meant to be like another app you're building that works with the API at the same time. But now we need to be able to send data back to a given server — to our API — and for that we want the POST method. The idea is that the endpoint stays the same; to change the request in the notebook, all we do is swap requests.get for requests.post. Now it attempts to send data, though in this case it isn't actually sending anything, because we didn't define any data to send. If we look at the response itself — response.text — we see "detail: method not allowed," as in the HTTP method is not allowed (if we switch back to get, it works, and the response unpacks a couple of things we'll look at in a moment). Going back to the POST attempt: that "not allowed" error comes from our routing — we don't have a route to handle that HTTP method. The only routes we have right now handle GET at these particular endpoints. To handle POST as well, we can just duplicate the list function and change router.get to router.post — this becomes the "send data here" kind of route — and rename it from read_events to something like create_event. The important part to note is that the endpoint is the same, but the method is different: in FastAPI we create a different function to handle each combination of HTTP method and route. Sometimes you'll see APIs that use a literal /create path — quite literally a different route to handle the different method — but that's not nearly as common for creating data, so we'll leave ours as-is: one is the list/"get data" view, the other is the "send data"/create view at the same path. The idea is that when we go to create an event, the response should be very similar to the detail view, because when we add data into the database we can return: "hey, we added that data — and here's the data we added." And if we modify things inside this function (which we might), we want to send back whatever that modification ends up being. So for now I'll return the original EventSchema shape with an arbitrary id, since we don't have a database to assign one just yet. We're almost there: now that the endpoint exists, let's try the request again — the method is allowed now, and the response is just echoed back. So at this point the POST method is basically the same as the GET method: it's just echoing back some data.
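A sketch of that new create route; the id value is arbitrary since there is no database yet:

```python
# src/api/events/routing.py (sketch) -- same path as the list view, different HTTP method
@router.post("/")
def create_event() -> EventSchema:
    # no database yet, so echo back an arbitrary id (123 is illustrative)
    return {"id": 123}
```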
But that's not what we want to do — we want to actually send data and actually see some data. In the route I'm going to add a parameter, data, declared as a dict and defaulting to an empty dictionary, and I want to print it out — so I'm declaring the data type and giving it an empty dictionary as the default. This is actually very similar to what we did down in the detail route, where we passed an argument — a wildcard, if you will — into the route; very similar, but slightly different. Let's take a look: we send the event again, and we see there's no data coming through — just that empty dictionary. If I actually want to send data, though, I need to change things a little. Before I do, I'm going to jump into the response and grab the headers: we see a bunch of headers, but the important part is the content type. What's happening on the server side is that our FastAPI application expects a certain data type to be sent, especially when sending data in — the data itself is expected to be a JSON data type. So I'll set up some data to send — say a page value like "/test" or something like that — and pass data=data to requests.post. But there's a caveat: if I run this now, I get an error. It says "input should be a valid dictionary" along with the actual input that was sent (I can add an else branch that prints response.text so we can see what's going on with that response). What was sent is not JSON data. We can produce proper JSON in a couple of different ways; one of them is to bring in the json package that's built into Python itself and use json.dumps.
If we call json.dumps(data) and hit enter, we see a string — it is quite literally a string, which you can check by wrapping it in type(); it's no longer a dictionary like what we have above. In other words, it's string data in a very specific format — you guessed it, JSON — and we can send that data to our backend. Let's try it: instead of passing the raw dict, I pass json.dumps(data), run it, and this time I don't get that error; it actually sent the data, and we can take a look at it on the server side. What happened is that this data was treated as JSON even though I didn't explicitly tell it to be. One of the other things you'll do from time to time is create headers declaring what kind of data is coming through — in this case application/json. That's not the only kind of data: if you were sending a file like an image, you would not use application/json, you'd use something different. But the idea is that you can pass those headers along with that same data, and our backend — our actual API — will look at them and say, "okay, cool, I see you, headers; you say application/json, so I can treat this data as JSON." In the case of FastAPI, it actually unpacks that JSON for us and turns it into an actual dictionary, which we can verify by printing the type of the data on the server side — and there it is, a dict. In other words, this is all happening for us under the hood, but we still need to understand what's going on: when APIs communicate with each other, it's usually (not always — there are other data types you can pass) through JSON data, and FastAPI was designed so that when we get POST data it just expects it to be JSON and infers that the incoming data is a dictionary. So I should be able to respond with this data, assuming I did it correctly — except the response schema I've set up won't allow this arbitrary data to come back as an echo, so I'll remove that schema for just a moment so we can see the echo come through. With the page data set, I run this all again and we see the echo coming back; as soon as I bring that schema back and rerun, it errors again, as we'll see in a second. So we need one more step with our data, and that's going to be an incoming schema — this parameter can't stay generic like a plain dict; it needs to be something else. For now I'll leave it, and we'll look at that incoming schema in just a moment. One of the cool things is that we can go even further: I can copy this router function down below and use put, as in we're going to update this data. I'm not actually going to cover patch, but put is very similar — we have the event_id coming through, plus the actual data that would be coming in from the client.
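On the notebook side, the two ways of sending JSON just walked through look roughly like this — a sketch; the port and page value are illustrative:

```python
# notebook cell (sketch) -- two ways to send JSON to the create endpoint
import json
import requests

endpoint = "http://localhost:8002/api/events/"
data = {"page": "/test"}

# explicit: serialize the payload yourself and declare the content type
response = requests.post(
    endpoint,
    data=json.dumps(data),
    headers={"Content-Type": "application/json"},
)
print(response.ok, response.text)

# shortcut: let requests serialize the dict and set the header for you
response = requests.post(endpoint, json=data)
print(response.ok, response.json())
```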
That incoming data we wouldn't call data but rather payload — that's what we'll end up calling it for the put route — and we'll look at this one as well, but for now it just returns the same shape back and is named something more like update_event. In the case of an analytics tool, I don't know why you'd update an event directly instead of just entering a new one, but this is the kind of thing we need to understand about these different methods. There are other methods too that I'm probably not going to implement for a while — one being delete, where you'd delete a specific event — and we're not implementing that just yet. But now we want to actually validate the incoming data to our API: we need to ensure the data being sent to us is correct, or at least in the correct format, very much like when we send data back it needs to be in the correct format as the event schema. The incoming data needs to be in the correct format, not just the correct data type — not just "a dictionary," for example. So we want to create a new schema to help with this validation. We already saw elements of this validation with the GET lookup, where we said event_id is an integer so we only accept a number of some kind; you could obviously change it to str, which would essentially turn that path segment into a wildcard rather than an actual number, but we want to keep that one as an integer. The same concept applies to our create_event. So we jump into schemas.py and create a brand new schema, the event create schema. It no longer has an id; instead it has a path field, which for now we'll just leave as a string value. This schema then gets brought into the router: we import it and use it as the type of the body parameter — and I'll rename that parameter to simply payload. You could keep it as data, or a whole lot of other things, but I find the term payload a little better: "data" is too generic for what this kind of data is. Payload is what's being sent to us — it could be correct data, but realistically it's a payload, just like what we started calling it down in the put route. The other thing is that for the put method we could use the same schema, or we could expand things if we want to allow something different. Say we add an event update schema, and instead of updating the path we only allow updating the description — that's literally the only field the update accepts, and it's required. What that means is there are now three fields an event could potentially save — id, path, and description — and that's what would actually be stored in the database, while the individual endpoints only support a subset of those fields.
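A sketch of those incoming schemas:

```python
# src/schemas.py (sketch) -- separate schemas for incoming data
from pydantic import BaseModel

class EventCreateSchema(BaseModel):
    path: str            # the walkthrough later renames this field to "page"

class EventUpdateSchema(BaseModel):
    description: str     # the only field the update endpoint accepts
```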
The full event schema — the final one — might have all of them; we'll talk about that when we get to the database stuff. Now that I've got both of these schemas, I'll bring the update one into the put route as well — let's go ahead and do that — and this time I'll also print out the payload, because I want to see what these two look like. Going back into the "send data to the API" notebook and running it, we get an issue with our input: we sent page, but the error says "field required: path," not page. That means I either change the notebook to send path, or change the schema to use page. I'm actually going to change the schema, because maybe I don't want it called path — maybe I want it to actually be called page — so I save it like that, go back to the API call in the notebook, and now it sends back successfully, which is great. We could always echo the data back as well, by updating the event schema payload altogether — we'll look at that in just a second — but what I also want to see is how this update view works. Back in the notebook I'll copy some of those cells, paste them below, and change the endpoint slightly: it's the detail endpoint now, with 12 and a trailing slash because of how our route is set up. I'll call it detail_endpoint with a detail path, and we probably don't need the base URL duplicated. You might be wondering which method we use — hopefully you remember it's simply the put method. Now, there's one other thing I can do here that's actually really nice: instead of data=json.dumps(data) plus the headers, I can just use the json parameter, and python requests will handle all of those other things for me. We said description is the value we need, so I'll say "hello world," make it an HTTP put, run it, and I see that it's okay; in the terminal of the running application I see the description "hello world" and there's the page. Back in the route, remember the payload is not a dictionary now but a schema, so to get at the values I use payload.page in one route and payload.description in the other — that one takes page, the other takes description — giving us the full advantage of using Pydantic in our API. Back in the notebook, if I run it again I see "hello world" for that description, and the other one echoes something like the page itself. Cool — we now have the routing and the schemas for incoming validation as well as outgoing validation, all working with Pydantic. We can also have optional fields; at this point we don't have any — only required fields — so let's see how we can create those optional fields.
Before I do, I want to update my notebook a little so that the create endpoint and the update endpoint are each basically one cell by themselves, so there are no issues between them. The create one goes here — I'll name the variable create_endpoint — turn the data back into JSON, and pass in the dictionary with a page value like "/test" or whatever. This is mostly so there are no conflicts between variables coming through from the different responses, and so we only have to run each cell one time: get, create, update — great. Now let's add in the optional data. Jumping into the schema, the optional data I want is in the event schema: I probably want page and description to be optional, and I'll start with just page. The way this works is that page: str is required data; if we want it optional, we bring in the Optional class from typing and wrap the type with the square brackets. Now it's optional — or at least seemingly optional — so let's save everything, jump back into our notebook, and run it. I get an internal server error, and if we look at that error, the field is still being treated as required, so it's not quite optional yet; we need to add a default value for it. It's almost optional, just not quite there — my default is going to be just an empty string. Make sure it's saved, and once again, back in the notebook, run that post — and this time it seems to have worked. What I want now is to echo back that response and see the data: I don't have an error with that field anymore, but it doesn't seem to show up much. If we look at the response coming back, page is an empty string, which is actually kind of nice in the sense that, yeah, cool, the page is there; if we look at the other response, same thing — the page is there and it's empty. What I want then is to actually return the page data, at least from the first route, and the way we do that is by jumping into routing and adding the page value to the response — payload.page. We save everything, go back into our notebook, run it again, and now the page value comes through; it's effectively an optional value for that field. And we can do the same thing again with the description: make it optional in the schema, and if I look at my requests, there it all is. Of course, I want to echo the description back as well — so back in routing, the update route has our payload with the description, and I can do the same thing there with payload.description.
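In schema form, the optional fields end up looking roughly like this — a sketch:

```python
# src/schemas.py (sketch) -- optional fields need both Optional[...] and a default value
from typing import Optional
from pydantic import BaseModel

class EventSchema(BaseModel):
    id: int
    page: Optional[str] = ""          # without the "= ..." default the field is still required
    description: Optional[str] = ""
```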
Great — that handles the echoing, and of course we can verify it again and see that description coming through; it's just echoing whatever we sent, and that's that. In the long run these schemas would be tied into what's happening in the database — in the middle of all of this, data would be sent to the database and received back from it — but the other thing is that we can make the payload itself a little easier on ourselves. What we can do is data = payload.model_dump(), which takes the payload and turns it into a dictionary. That comes from Pydantic itself — it used to be something different, .dict() if I remember correctly, but now it's simply model_dump() — and with that dictionary I can just unpack it into the response, and do the same thing down in the update route: unpack it there as well. We probably don't need those print statements anymore, and now the data comes through the way we probably want. Once again I can verify this in my notebook and see the echoed data coming through. One of the questions, of course, would be: could I actually send the description now if I wanted to? If I add description with something like abc123, will that come back in the response? Right now the answer is no, and hopefully you understand why at this point, but let's go through it: the data that comes through is just what's in the payload schema — if you pass in more data, it just gets ignored. There's probably a way around that, to make it say "hey, you sent too much data," but for us we really just want to make sure we're grabbing the correct data. So I can go up and give the create schema that same optional description, and on the update view I can make page optional too if I want to be able to update it. Now that the description comes through, I should be able to echo it back as well — and there it is, now being echoed — and I have a way to change the responses really quickly, all thanks to these optional values. Every once in a while you might also want to add a real default value; that default would use Field — something like Field(default="") or a default description. In that case, go back to the API call, get rid of the description I was passing in, and run it again — it now gives me that default value, if you wanted to put one in there. There are more verbose and more robust ways to keep modifying this, but overall it's really nice, because I have the ability to change all sorts of things in my schemas, once again adding stability to how we use Pydantic inside FastAPI. Our API now has the ability to handle different HTTP methods, like GET and POST, at different API endpoints, like our list view or our detail view, and we can validate both incoming and outgoing data.
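Sketched out, the create route and a Field-based default look roughly like this — the id and default text are illustrative:

```python
# src/api/events/routing.py (sketch) -- echo the validated payload via model_dump()
from schemas import EventCreateSchema, EventSchema

@router.post("/")
def create_event(payload: EventCreateSchema) -> EventSchema:
    data = payload.model_dump()      # Pydantic v2; v1 called this .dict()
    return {"id": 123, **data}       # arbitrary id until the database assigns one
```

```python
# src/schemas.py (sketch) -- a default supplied through Field
from typing import Optional
from pydantic import BaseModel, Field

class EventCreateSchema(BaseModel):
    page: str
    description: Optional[str] = Field(default="my default description")
```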
A big part of the reason I showed you schemas in this way, in multiple parts, is that how you use the different pieces of data might change depending on the needs of your API service. For an analytics API, do we really ever want to update an event? Maybe, maybe not — that's something you'll decide as you build this out more and more. The point is that we now have a foundation to move to the next level, and that next level is using SQL databases to actually store the data coming through and handle it in a very similar way to what we've been doing. The way we're going to do that is with SQLModel. If you look at SQLModel's documentation and how it defines a database table, it hopefully looks very familiar to what we just did — that's because SQLModel is based on Pydantic; it's powered by Pydantic and something called SQLAlchemy. SQLModel is definitely one of the more modern, cutting-edge ways to work with Python and SQL. SQL, of course, is designed for storing data — it's much better at it than Python would be; Python itself is not a database, SQL is. So we're going to use SQLModel to integrate with a Postgres database called TimescaleDB, plus a few other packages to make sure it all works really well — we'll look at how in the next section. In that section we're implementing SQLModel, which really just means taking that Pydantic schema validation for incoming and outgoing data and converting it so it actually stores into a SQL database. SQLModel will handle all of the CRUD operations — create, retrieve, update, and delete — against that SQL database. The database we're using is Postgres, one of the most popular databases in the world for good reason; one of those reasons is that it has a lot of third-party extensions to really pick up where Postgres falls flat. One of the things Postgres doesn't do very well natively is real-time analytics, or heavy data ingestion — and that's where TimescaleDB picks up the slack in a big way. We're using Timescale because we're building an analytics API and it's designed for real-time analytics, and it is still Postgres, so the underlying way we use SQLModel will be the same regardless of which version of Postgres you use.
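As a preview, a table definition in SQLModel looks roughly like this — a sketch loosely following the pattern in the SQLModel docs; the field names are illustrative and the model we actually build may differ:

```python
# sketch -- roughly what an events table looks like in SQLModel
from typing import Optional
from sqlmodel import Field, SQLModel

class EventModel(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)  # database-assigned id
    page: Optional[str] = ""
    description: Optional[str] = ""
```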
That's especially true in this section, which is really just about using SQLModel itself — we'll use the advanced features Timescale gives us in the next section. This section is about understanding how to use SQLModel and actually storing data in a Postgres database, instead of just using Pydantic to validate data, because on its own that doesn't provide that much value. Let's dive in. On our local machine we're going to use Docker Compose to spin up our Postgres database: we'll start with the configuration for plain old Postgres, then upgrade it to Timescale, all with Docker Compose in just a moment. If you skipped the Docker Compose parts, you can always log in to timescale.com, get a database URL, and just use the production-ready version — we'll go through that process in the next section — but for now I want to do the Postgres one on my local machine. A big part of this is looking up the Postgres image's environment variables: POSTGRES_PASSWORD, POSTGRES_USER, and POSTGRES_DB are the main ones we need to get our local version working inside Docker Compose. So I open compose.yaml, tab back out so we're at the same level as the app service, and add a new service — I'll call it db_service — then sketch out its settings underneath; breaking it down like that makes it a little easier to see exactly what to do. The first thing is deciding the image: if I were using Postgres itself, I'd go to the official postgres image and look for the tag I'm going to use — and be very deliberate about the tag, especially with Postgres as a database. Maybe you use 17.4-bookworm or just 17.4; the sizes look about the same, so it probably doesn't matter much between the two — something like postgres:17.4. Next up you set the environment, which is all of those environment variables: searching the image's README for POSTGRES_USER shows the different variables, so you add POSTGRES_USER, then POSTGRES_PASSWORD, and then the actual database with POSTGRES_DB — those are the core values we want. Next — and this is one we will definitely need — is something called a volume. This volume is going to be managed by Docker Compose; we don't manage it ourselves, and it's not quite like what we did with our app, where we mounted a folder into the container. Instead we let Docker Compose manage the volume itself: before declaring it on the service, I add a top-level volumes section and name one — I'll call it timescaledb_data, and that's all you need to put in — and then on the service you reference whatever you named it under volumes and map it to a specific location, which in the case of the Postgres database is /var/lib/postgresql/data.
That allows your Postgres instance running under Docker Compose to persist data, which you definitely want: if you take this down or run docker compose down, the data will still be there when you bring it back up. Similar to the app itself, we also declare the ports; I will use the default mapping of 5432:5432, and in some cases, especially with what we are about to do, you may also want to expose the database port 5432. I will explain that when we start integrating this into our local project.

So this is using Postgres, and we could absolutely go that route, but I actually do not want to use Postgres directly; I want to use Timescale. Timescale is not a whole lot different. We grab the timescale/timescaledb repository as the image name and then pick an image tag; in the tags list, or right in the overview, the notable one is latest-pg17. So where plain Postgres had 17.4, with Timescale we can just use latest-pg17, and everything else stays the same, which is really nice. Of course I am going to change the credential values slightly: the user becomes something like time-user, the password something like time-pw, and the database name something like timescaledb.

All of those values let me build a connection string in my .env: postgresql+psycopg:// followed by the username, a colon, the password, an @ with some host value we will come back to, the port, and then a slash and the database name; we will go over the +psycopg piece again once we implement it. This is what our app needs in its environment variables, so on the app service we can add something like DATABASE_URL set to that value. In development this is generally okay; in production you do not want your environment variables exposing something like this directly, in which case you would use a .env file as we discussed before and reference it under env_file. And as we saw with our app earlier, whatever you hardcode in the compose file takes precedence over the .env files, which is convenient.

There is one key component we still need to change, and that is the host value. When you are running inside Docker Compose, the host is the name of the database service itself. I will show you how to use it in both places, but for now let's leave it like this and save. Inside my compose file all I have right now is the app running, so I am going to take that down and restart it; it looks like I am having a few issues with the app, and I am not sure whether it is related to this Docker Compose rebuild, but it seems that it is. With the app down, I will run docker compose up, because I really just want to see that our Postgres database comes up and works.
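Putting those pieces together, the database service in compose.yaml might look roughly like this. Treat it as a sketch rather than the exact file from the walkthrough: the service name, the volume name, and the credentials (time-user, time-pw, timescaledb) are just the illustrative values picked above, and the DATABASE_URL on the app shows the host set to the service name for the in-network case.

```yaml
services:
  app:
    # ... existing app configuration ...
    environment:
      # inside the Compose network, the host is the database service's name
      - DATABASE_URL=postgresql+psycopg://time-user:time-pw@db_service:5432/timescaledb
  db_service:
    image: timescale/timescaledb:latest-pg17
    environment:
      - POSTGRES_USER=time-user
      - POSTGRES_PASSWORD=time-pw
      - POSTGRES_DB=timescaledb
    ports:
      - "5432:5432"
    expose:
      - 5432
    volumes:
      # named volume managed by Docker Compose, mapped to Postgres's data directory
      - timescale_db_data:/var/lib/postgresql/data

volumes:
  timescale_db_data:
```

From the host machine, for example the locally run app we set up later, the same URL works with localhost in place of db_service, which is exactly the switch made further down.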
Our db_service gets created and the image downloaded, no surprise there; it is pulling the Docker Hub version of TimescaleDB, which of course is open source, which is really nice. I will let that finish. Okay, it finished running and everything looks like it is working. We might want to test this with Postgres tooling directly, but in our case we will leave it as is and test it with Python; if you are familiar with Postgres, you will probably want to verify it yourself. The other thing I want to confirm is that my app is still running, and it looks like it is, so our Docker Compose setup is probably good and we are ready to start the integration inside the app itself.

I am going to integrate it in two different ways: one is the Docker Compose version of the app, and the other is treating the database as a Docker Compose service that our locally run app can also talk to. In other words, two environments that can both work against this Docker Compose instance of the database.

One thing worth noting, because we will do it from time to time, is that from the root of the project we can run docker compose down -v. The -v flag deletes the volume as well, which means it deletes all of the database data too. The reason we run this in development is to get things right, to get them working correctly from a clean slate; once the delete finishes, we bring everything back up and we are starting basically from scratch. Being able to tear things down and bring them back up that quickly is a key part of development and another reason to really like Docker Compose. The downside, of course, is that you might delete a bunch of test data you did not intend to, which is something to think about before you run docker compose down -v.

One of the first steps in integrating the database is being able to load environment variables. With Docker Compose, the environment variables are injected for us. In other words, I can go into src, open one of the API route modules, add import os, and inside read_events print out os.environ.get("DATABASE_URL").
So if I go to /api/events and hit enter, I should see that print statement come through, and there it is. That variable is being injected into the application because of the environment entry in the compose file. Unfortunately, or fortunately, this will not necessarily work if we run the application directly from our virtual environment. Let's do source venv/bin/activate, navigate into src, and run the same thing, this time without specifying a port, so it will most likely use port 8000. Opening that one up and going to /api/events, the print statement comes back None. So we need a way to handle both scenarios: some of you will want to use Docker Compose while developing, and I do a lot of the time, but I also often do not set my apps up that way, so our FastAPI application needs to be ready for both.

requirements.txt is where we change something: I am going to add python-decouple. There are other ways to load environment variables, one is python-dotenv, but I like python-decouple for a reason you will see in a moment. The idea is that inside my api package I create another folder called db, give it an __init__.py, and also create config.py. After running pip install python-decouple, in config.py I do from decouple import config and import it as decouple_config, mostly because the module I just created is also named config and we do not want any weird imports. As you may notice, the editor complains that decouple is not installed, so make sure it really is installed in your virtual environment with that pip install; I already added it to requirements.txt, so it does not need to be added there again.

With that running, what I like to do is define DATABASE_URL equal to decouple_config("DATABASE_URL") with a default of an empty string. This is part of why I like python-decouple: we get that empty-string default, and decouple can also load values from a .env file, like the one holding our host value. The way we use it is that in our events routing, instead of using os, we do from .db.config import DATABASE_URL, or better yet from api.db.config import DATABASE_URL, and we will see which of those imports ends up working.
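As a reference, the config module described here can be as small as the sketch below; the file path api/db/config.py just follows the layout used in this walkthrough, and the default="" is the detail called out above.

```python
# api/db/config.py -- a minimal sketch of the settings module
from decouple import config as decouple_config

# reads DATABASE_URL from the process environment or a .env file;
# falls back to an empty string so the import itself never fails
DATABASE_URL = decouple_config("DATABASE_URL", default="")
```

Anywhere else in the project it is then just `from api.db.config import DATABASE_URL`.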
Let's check that print statement. Going back and refreshing, os.environ now gives None, but DATABASE_URL from decouple is working, and it shows a different host value; that host value comes from the .env file. This brings me back to what I would actually do with Docker Compose: rather than hardcoding the values in the compose file, I would bring in another env file, something like .env.compose, with those same parameters in it, and probably the same port value; that part is completely up to us. So we have the DATABASE_URL there, and in compose.yaml the env_file entry points at .env.compose, so I can get rid of the environment block entirely. The only reason I am commenting it out rather than deleting it is so we can bring it back as needed.

What we see next is a module-not-found error. I thought maybe I had not saved something, so let's bring it back up, which should rebuild everything, but it is not actually installing python-decouple. So I will stop it, and when in doubt in a situation like this, run docker compose up --build, which rebuilds the images that need building and installs those requirements. Actually, I know why the rebuild did not happen: I did not run Docker Compose in watch mode. That is something else to remember, docker compose up --watch, so it rebuilds when requirements.txt changes. With that, the error goes away.

The key part is that I now have another place for my environment variables: if I refresh the Docker Compose version or my local version, I should see the variables coming through in either one, so both setups are supported. The reason I am doing this is so you get very familiar with the process of loading environment variables, especially as you work toward production, because you are going to want to isolate these things going forward. Notice that .env.compose is not ignored and is ready to go into git; if we look at our git status,
we can see that .env.compose is in there, while if we look at the GitHub repo with the committed code, .env itself is not. This is an example of environment variables I am okay with committing: compose.yaml is still using them, and this is really just a local test environment, one I consider disposable since I can delete it at any time. Obviously the usernames and passwords here are not great, and that should also tell you that if you ever wanted to run the Timescale service via Compose in production, these values should live in an env file and be handled differently. We will not be using Compose in production at all; we will be using Dockerfiles, but not Docker Compose. So that is loading environment variables in both places, and as you can see, the Docker Compose route is far easier than what I just did locally. One more thing: beyond environment variables, if you have any other static settings for the project, this config module is a good place for those as well.

Now we are going to convert our Pydantic schemas into SQLModel classes. This is before we actually store anything in the SQL database; I really just want to see how SQLModel works in relation to the Pydantic models. So I will grab schemas.py, copy it, and make a new file called models.py. Realistically we are just updating the previous classes to be based on SQLModel, and it is incredibly straightforward. I am going to keep the same names for now, EventSchema and so on; we might change them later, but keeping the names mostly keeps the imports simple when we go to test this. The idea is to stop using Pydantic directly and instead do from sqlmodel import SQLModel and also Field. BaseModel becomes SQLModel, so we replace it everywhere, and the Field usage should not need to change. That is really the main edit, although the id field will probably have to change later, which you can see in the documentation; it is going to look a bit different at some point, so we might as well note that as a comment and come back to it.

What we have now is basically the same thing we had before. Inside routing, instead of importing from schemas we import from models. Then we jump into our notebook and confirm that everything there still works. Make sure the app is running; I have it running in two places, inside Docker and on my local machine, and it looks like it is. Running it all together, everything works as it did before, so the models are set up just as they were. The big change from here will be turning one of these into a table, meaning the thing we actually store, but we are not doing that just yet. The point of this step was to see that SQLModel and Pydantic are interchangeable; if you look at the documentation, a SQLModel class is also a Pydantic model, so it is quite literally a drop-in replacement, and as we will see very soon it will also help us store this data.
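To make the drop-in nature concrete, here is a rough sketch of what the converted models.py might look like at this point. The field list (id, page, description) and the results/count shape of the list schema are inferred from the earlier schema work described in the narration, so treat them as assumptions rather than the verbatim course code.

```python
# api/events/models.py -- the Pydantic schemas re-based on SQLModel (sketch)
from typing import List, Optional

from sqlmodel import SQLModel, Field  # SQLModel replaces pydantic.BaseModel


class EventSchema(SQLModel):
    # the id definition will change once this becomes a real table
    id: int
    page: Optional[str] = Field(default="")
    description: Optional[str] = Field(default="")


class EventListSchema(SQLModel):
    results: List[EventSchema]
    count: int
```

The other schemas (create, update) convert the same way: swap the base class and keep the fields.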
Now we need to create our first SQL table, our database table. The way we are going to do this is a three-step process. The first step is connecting to the database itself using a database engine; by connecting I simply mean giving Python the ability to talk to the SQL database, which is exactly what a database engine gives us. After we can connect, we need to decide which of our schemas is going to be an actual database table; that might be all of them, one of them, or none of them, but we need to decide, and then make sure it is set up correctly to be a table. Finally, inside FastAPI, we need to make sure FastAPI is wired up so those tables actually get created, and created correctly.

So the first thing is deciding which of these classes becomes a database table. You may never have worked with databases before, or you have and never had to think about how these tables work enough to make a decision like this. If you think of a database table as being very similar to a spreadsheet, you are on the right track: a spreadsheet has columns, you can keep adding columns as you see fit, and when you store data, those columns describe what kind of data goes in each one. You might have a column for id, a column for page, a column for description. Every column in a database has a data type: an id column holds only integers, a page column holds strings, and every once in a while you will have a datetime column, or floating-point numbers, or any of the many other data types out there. The point is that SQL tables, and the structure of SQL itself, can run very complex operations across all of that data.

A spreadsheet might only have a few thousand rows at any given time, maybe up to a million, but if you have a million rows there is a really low likelihood you are opening it in Excel; you probably need something different altogether. A spreadsheet has practical limits on rows, while SQL tables basically do not, or at least they are designed to hold massive amounts of data, and those column data types make a huge difference in how we can use them. For example, if you want all ids greater than 100 or 1,000, the SQL table can do that very efficiently rather than scanning row by row for each id. There are many more advantages to using a database to store data, and I am not going to go through them all; what we need to know right now is that our columns need specific data types. The nice thing about SQLModel is that we have already declared those types, integer, string, string, so we already have three typed fields that will become columns in the database. Columns, fields, you can think of them as the same thing in how we use SQLModel. So the next step is deciding which of these classes becomes a table, and that is
actually fairly straightforward if we look at how we are using these classes in routing.py; you may already have some intuition about which one to pick, but let's go through each view and see how it is used and how that affects what we decide to store. The very first one uses EventListSchema, and it is a GET: we are only extracting data from the database. When you extract data, you are generally not going to store the extraction you just did, because that would be a circular loop; you might record the fact that an extraction happened, but you are probably not going to store the extracted data again. So EventListSchema is probably not going to be a database table.

Next is create_event. The incoming data is what we end up storing, but do we always want to store every field that comes in, or do we want the incoming payload to stay flexible? For example, if we decided we no longer wanted to accept a description on the create event, would we also want to remove that field from the entire database? The answer is no: we would keep the column and simply stop accepting that field on input. So EventCreateSchema is probably ruled out as well. But notice that the view turns the payload into an EventSchema and returns that back. That is the one we definitely want to store: a validated payload comes in, we turn it into this other type, and that is what ends up in the table. EventSchema is also used multiple times for database-like things: it is what we look up when reading from the table, and when we update data we take the incoming changes, decide which fields are allowed to change without removing them from the database, and return an EventSchema back. The list view returns results made of EventSchema too. So EventSchema is the only one we really need to save.

I actually do not want to call it EventSchema anymore; I want to call it EventModel, and the reason is hopefully straightforward. So let's modify it: rename it to EventModel and add table=True to the class. Since I am changing this anyway, I will also delete schemas.py and just use models.py to keep things simple, and inside routing we make the corresponding changes, because it is no longer EventSchema but EventModel; everywhere that said EventSchema now says EventModel. The reason for the name EventModel is that you might as well think of it as the event table, the event database table, and that is exactly the point.
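After the rename, the table model ends up looking something like this; the id definition follows the pattern SQLModel's documentation uses for an auto-incrementing primary key, and the other field names are the ones assumed earlier in this walkthrough.

```python
# api/events/models.py -- the one class that becomes a real database table
from typing import Optional

from sqlmodel import SQLModel, Field


class EventModel(SQLModel, table=True):
    # table=True tells SQLModel to map this class to a database table;
    # default=None plus primary_key=True lets the database assign the id
    id: Optional[int] = Field(default=None, primary_key=True)
    page: Optional[str] = Field(default="")
    description: Optional[str] = Field(default="")
```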
That is why I went through and changed every place that said EventSchema. Now when I look at the remaining schema classes, I can think of them as explicitly not being database tables, not being database models, and that is kind of the point. Those could be plain Pydantic models, but in its documentation SQLModel recommends using SQLModel for them as well; there are presumably advantages we could look up, but for now we will leave it as is.

Next we need to connect Python to SQL by actually creating a database engine instance and then creating the tables themselves. Inside the db module we create session.py; it will handle something else later, which is why it is called session, but the idea for now is that we need the engine, so engine equals something, and we need a function that initializes the database, something like init_db.

Before even creating the engine, we need to know when init_db will run, and the answer is in main.py, in what is called a lifespan handler for FastAPI. Before I show the lifespan approach, here is the old way: an event hook. You could write @app.on_event("startup"), define an on_startup function, and run the database init there. That is still valid and still works, but it is deprecated, meaning it will not be supported in the long run, so we do something slightly different. If you do have on_event handlers, you will want to use either all on_event hooks or none of them before switching to the context-manager approach, which is what we are doing now.

At the very top of main.py we bring in asynccontextmanager from contextlib. We use it as a decorator and define the lifespan function; it needs to be async, async def lifespan(app), and it takes in the app, an instance of the FastAPI application, so you could absolutely use the app inside it if you wanted. This is where we call init_db, then we yield, and anything after the yield is whatever cleanup you might want to do. So essentially: before app startup, run the init; yield while the app runs; clean up anything afterward. I think one reason this pattern was introduced was machine-learning models, where you want to load something before the app starts and release it after, so it gives us a before and an after around the app. The idea is that we pass this lifespan function into FastAPI.

Now we want to bring in that init function, so in main.py we do from api.db.session import init_db. Of course the session module actually has to exist and work, so let's go back into session.py and build it out.
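Here is a sketch of how that lifespan wiring can look in main.py, assuming the api.db.session module we are about to write exposes an init_db function:

```python
# src/main.py -- lifespan hook sketch
from contextlib import asynccontextmanager

from fastapi import FastAPI

from api.db.session import init_db


@asynccontextmanager
async def lifespan(app: FastAPI):
    # before app startup
    init_db()
    yield
    # after app shutdown: any cleanup would go here


app = FastAPI(lifespan=lifespan)
```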
The first thing in session.py is the database engine. We import sqlmodel itself, not SQLAlchemy, and create the engine with sqlmodel.create_engine, passing in the DATABASE_URL string, which also means we need from .config import DATABASE_URL. One important detail about DATABASE_URL is that it has a default of an empty string, so I want to make sure that is not the case here: if DATABASE_URL equals the empty string, we raise a NotImplementedError saying the database URL needs to be set. The reason I define the engine at module level is that later we will add something called get_session, which will also use this engine.

Now we need to actually initialize the database. From sqlmodel we also import the SQLModel class itself and call SQLModel.metadata.create_all with the engine as the argument. What this does is connect to the database and make sure it has all of the tables we have declared with table=True. One thing it will not do is search your Python modules for every subclass; that is not how SQLModel works. It only creates tables for models that are actually imported and used. In our case they are: routing.py imports EventModel, and that routing module is imported into the main application through the event router, so the metadata call should work.

Let's save everything. When you do, you might get an error saying psycopg is not installed, and you might get it in both places. We need to make sure psycopg is installed, and right now I do not have it. If you remember the environment variables, we wrote postgresql+psycopg in the URL, and that is exactly why it is looking for this package. So in requirements.txt we add psycopg[binary]; I am using the binary build, which is why the brackets are there. This is psycopg 3; there is an older package called psycopg2 that you may also see, but that is the previous generation. With the Docker setup, updating requirements should trigger a rebuild, and it looks like we have a small issue there, but if you are not using Docker, which is why I have both running, you just run pip install -r requirements.txt. Using psycopg[binary] is, I think, the way to go; it makes the integration a little easier to run, and there are other ways to connect that we will touch on in a second. It loads and runs, but I get another error that we definitely need to fix, and we will talk about it in just a moment. On the Docker side, everything rebuilt as expected.
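For reference, the session module at this point can be sketched like this; it assumes the config module from earlier and the psycopg[binary] requirement just discussed.

```python
# api/db/session.py -- engine and table creation, a sketch
import sqlmodel
from sqlmodel import SQLModel

from .config import DATABASE_URL

if DATABASE_URL == "":
    # fail loudly instead of silently connecting to nothing
    raise NotImplementedError("DATABASE_URL needs to be set")

# the +psycopg driver in the URL is what requires the psycopg[binary] package
engine = sqlmodel.create_engine(DATABASE_URL)


def init_db():
    print("creating database tables...")
    # creates a table for every imported model declared with table=True
    SQLModel.metadata.create_all(engine)
```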
With Docker looking like it works just fine, we will come back to our local Python run in a moment. The key piece here is the lifespan hook and the fact that it is creating these database tables. We will not know for sure that they are created until we solve that one remaining error, so let's talk about what the error is and how it relates to Docker, and to using Docker locally.

In the last part we saw an error that psycopg was not available, and the reason traces back to how we configured the database engine. In session.py we call create_engine with the DATABASE_URL, and the psycopg error happened because our environment variable includes +psycopg; we saw that error in both places, the local Python project and the Docker Compose version. The thing about SQLModel, and more specifically the underlying SQLAlchemy, is that it can talk to more than just Postgres: MySQL, MariaDB, and probably others it supports. That flexibility is why the URL in our .env has to say postgresql plus the specific Python package we use to connect, and it is why the error was raised, because we had never actually installed that package. Once we installed it, that error went away, but it brought a new one related to actually connecting to the database.

The new error is not especially clear on its face, but if you search it or read it carefully, it points to a connection that is failing. The Docker Compose version is not failing; it appears to be connecting correctly. The local Python version is failing because of where our database is coming from: it is the db_service in Docker Compose. Inside the Docker Compose network, which contains both the app and the database, the two can talk to each other; that internal networking is one of the great things about Compose, and it is exactly why we could write the database URL with db_service, literally the service name, as the host. It is a little different from our local browser: when we open the app in a browser, the address is not app, it is 0.0.0.0, and if I tried to use app as the hostname I would get a connection error, because that is not how my local machine reaches the app; it has to be 0.0.0.0 or localhost, and plain localhost also seems to work just fine. Connecting to the database service from outside the network works similarly but not identically, because the app connects slightly differently than the browser does. If we look in the .env file, there is that host value; if I change it to localhost and restart the Python application, that seems to do the trick. And if for some reason that did not do the trick, there is one other thing we could try.
Let me close this out, and inside the compose file I will remove the expose entry for a second, save, and run it again; it still connects, so we are okay as far as ports are concerned. Every once in a while, though, you might need to go into the .env and change the host from localhost to something based on Docker itself, which is host.docker.internal; that might be another value your app needs in order to run. Right now it is also giving us an issue, so it is not the one to use here; stick with localhost, just like the application has been using. The reason I wanted to show host.docker.internal, and I will leave it commented out right above, is that you might want to try it from time to time depending on your host machine and how it maps things; it is often mapped as well. This is one of the challenges of using Docker Compose for the database without putting your app in it too, but to me developing this way is fairly straightforward, and every once in a while you will need to use a database service that is not inside your internal network anyway. There is a lot more we could say there, but the point is that we needed to solve that single connection error.

This is also more easily solved by using a database that is definitely reachable: if you went into Timescale and created a database, you could bring in that connection string; it might still need the +psycopg piece because of how we connect, but the host and everything else would come from Timescale, an actual production-ready instance we could build against. We will see that very soon. For now, with the database issues solved, we can move on to actually using our notebooks to see whether we can store data, which takes a little more than just the notebooks themselves. Let's take a look.

Let's store some data into our database. First we need a session function inside the db session module. We create something called get_session, and inside it we use the Session class itself, which we bring in from sqlmodel, passing in the engine: with Session(engine) as session, and then we yield that session. The reason we have this is so we can use it inside any route that needs a database session.
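The session helper itself is tiny; here is a sketch of it, added to the same session.py alongside the engine:

```python
# api/db/session.py (continued) -- dependency that yields a database session
from sqlmodel import Session


def get_session():
    # one short-lived session per request, bound to the engine defined above
    with Session(engine) as session:
        yield session
```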
So the idea is to grab that get_session function. In routing we do from api.db.session import get_session, we bring in Depends from FastAPI, and we bring in Session from sqlmodel, not SQLAlchemy. All of that lets us add an argument to an event view like session: Session = Depends(get_session). That is it: that function will yield out the database session, so we can do database work further down. Since this function is getting a little bigger, I will separate things out a bit and move response_model up into the decorator, so we no longer need the return-type arrow; you could do that on all of the routes, and both styles are supported and work just fine.

Now that we have the database session, I actually want to use my model and store things with it. The first thing is obj = EventModel.model_validate(data): we take the incoming data and validate it again, but this time specifically against the model we are storing it with. Then session.add(obj) stages that data. When I say obj, think of it as an instance of the model, basically a row of that class; calling it obj is probably an old habit, and you can call it whatever you like, just be consistent across your project whenever you do something like this. Adding it is only preparing to save it; session.commit() is what actually writes it into the database.
Then, if we want to do anything with that object afterward, like read its id, we call session.refresh on it and simply return the object, since it is an instance of the model class itself. And that is how we add data to our database; it really is that simple. Just remember that to use a session, the session argument has to come in, and we use that session variable throughout the function; if you called it sesh, that would be fine, you would just have to update it below, though that is not common, so think carefully when you change these names. It is a Session, and it depends on the function that yields a session built from our database engine, which is exactly why we defined that engine in the first place.

Now that we have a way to add data, we can verify it in the notebook. I will restart the notebook and run it, and we see some data come back with an id. Do we recall whether that id was there before? I do not, so let's run it again: another id, and again, another one, and another one. The id keeps incrementing even though we never specify it; all the API request sends is the payload, and the database auto-increments the id for us thanks to the features of a SQL database, in this case Postgres, with that column as the primary key, as mentioned before. We can see that each time we add more data.

Of course we still need to fill out the other API routes, which we will do in just a moment. Before that, I will say that once you start filling the database with this kind of test data, you will occasionally want to run docker compose down -v to take the database down; that deletes the database and the network altogether, and then you bring it back up with the watch command so everything rebuilds as needed. With the database completely fresh, we can test again from the notebook. This time we get a connection error, which makes a certain amount of sense given how these sessions work inside a notebook, so I will restart the notebook and try again. Still a connection error, which is probably not good, so let's restart the application itself; which port are we using, it looks like 8002, so everything should be coming back up, and the database looks ready. Back in the notebook, maybe I just moved a little too fast for it, and there we go, it is connected again. Trying the insert again, we are good, and the id is auto-incrementing once more. So every once in a while you may need to let things refresh and rebuild; saving data and exercising the flow helps flush that out, and realistically you will not be taking the database down that often, especially not while you are narrating it.
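Putting those pieces together, the create route looks roughly like the sketch below; the router setup and the EventCreateSchema name are assumptions based on the earlier routing work rather than an exact copy of it.

```python
# api/events/routing.py -- create route sketch
from fastapi import APIRouter, Depends
from sqlmodel import Session

from api.db.session import get_session
from .models import EventModel, EventCreateSchema

router = APIRouter()


@router.post("/", response_model=EventModel)
def create_event(payload: EventCreateSchema, session: Session = Depends(get_session)):
    obj = EventModel.model_validate(payload)  # re-validate against the table model
    session.add(obj)       # stage the new row
    session.commit()       # write it to the database
    session.refresh(obj)   # pull back database-generated fields like id
    return obj
```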
We are now going to ask the database for a list of items, a certain number of items to come back; this is a database query that returns multiple results. The first thing is to look at read_events, because that is the API endpoint that lists things out. I want to bring in the session here too, since we are using the database, and once again we can use the response_model approach in the decorator, but this time it is EventListSchema instead of EventModel, which we already have defined below. The idea is to replace the hardcoded list with the actual results.

The way we do that is with a query. If you are familiar with SQL, you may know something like SELECT * FROM some table, which in our case would be the event table; we are doing essentially that, but through the EventModel class, in other words through the SQLModel ORM. That means bringing in the select helper and writing select(EventModel); that is our basic query. From there we get the results by executing it against the database session, session.exec with the query, and finally calling .all(), which gives us back all of the rows, and we can return them directly. The reason we can return them as-is is that the list schema's results field expects instances of this model, which is exactly what we are returning in this request; we can also grab the length of those results.

This should give us results now. Going back into the notebook's list cell, I added a print statement so we can see it a little better, and there is the ordering and all of that data. If we add more data we will see it again, so I will add a few more items, running the insert cell enough times to get over 10, maybe 12, items by id; later we can use that 12 again. Now running the list again, there are my 12 items. This raises a small issue: the results have 12 items, but maybe we only ever want to return 10, in which case we go back into routing and update the query with limit, passing in the amount we want to cap the results at. Running it again, there are 10 as a result, and we see only 10 items come back. I can also change the ordering, the display order, so back in routing, with that same query, we chain on an order_by.
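At this point the list route might look something like the sketch below; session.exec is SQLModel's call for running a statement, and the results/count shape matches the list schema assumed earlier. The ordering shown here is the tweak discussed next.

```python
# list route sketch -- limit and ordering applied to the select
from sqlmodel import select


@router.get("/", response_model=EventListSchema)
def read_events(session: Session = Depends(get_session)):
    # roughly: SELECT * FROM eventmodel ORDER BY id DESC LIMIT 10
    query = select(EventModel).order_by(EventModel.id.desc()).limit(10)
    results = session.exec(query).all()
    return {"results": results, "count": len(results)}
```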
How does this ordering work? It is based on the table itself, which is our EventModel, and the fields inside it. In my case I can order by id, which is that field right there; we could also use page or description, but those are the only fields we can really order by. Then we can ask for descending, which simply flips the order, and we can verify that by hitting the API again: now it is flipped, the id of 12 comes first, and 2 and 1 are no longer in the results at all. If I want to flip it back, I change it to ascending, and the order returns with different values showing.

The next level of this would be pagination, returning only 10 results at a time but across multiple pages of 10, so you can repeat the request to get all of the data. Realistically you would never fetch everything at once; that would almost certainly be too much, so you will usually use a limit of some kind, maybe a larger one like 100, 500, or 1,000 depending on what your API can handle. You may also have noticed that the default ordering was ascending by id, which is very easy to work with. That is our list view; now that we have it, we may want to get more specific and build a detail view, a single specific item we can work with and eventually update. Let's take a look.

Using SQLModel we will do a get-or-404 for any given event. This is very similar to what we have done before. First I will use response_model again, and this time it is just EventModel. Then we bring in the session argument just like before and build the query. Based on the model, we want to look up this one event, so the query is a select on EventModel with something called where, where we take the model's id field and set it equal to the event_id being passed through. That is our query. From that query we get a result, the actual data itself, by executing it against the session,
only this time, instead of .all(), we use .first(), and that value is what gets returned. That is pretty much it for our detail view, or is it? It is close, but not quite the whole story. Jumping into the notebook, I will copy the get-event cell and change it slightly so the path is appropriate for a detail view: I will call it a detail path that takes in something like 12, use that detail endpoint, and drop the base path since I no longer need it there. Then I will grab that detail endpoint for the request, assign the response to r, use r across the board, and print the status code this time instead of just the body.

Looking at the list view, we should still see an id of 12 in there; let's quickly switch the list to descending just to make sure the data matches what we are looking up. Back in the detail cell, running it with 12 returns that record. If I go one higher, I get a 500 error, a server error, and this should not be a server error; it should be a different kind of error. So this is not quite done, and it comes down to the query returning a None value: the error even tells us the input is simply None, because we are promising to return an EventModel but actually returning None. What we really need is: if not result, raise an HTTPException with a status_code of 404 and some detail saying the item was not found, say "Event not found". HTTPException comes from FastAPI, so we import it, save, and try the same lookup again; now we get a 404, the correct error for an item that is not found in the database.

The same idea protects us elsewhere: if I made a mistake in the lookup and passed a string value instead, we get a 500 again, and reading the error shows it choking on that parameter; it is not going to look up the id based on a string value, which is yet another reminder to use the correct data types for these lookups. Note that in the request I am making, the path value is a string; FastAPI parses it into a number for us, which is another nice thing about it. Overall we now have our GET lookup, and it is fairly straightforward. The next part is using the same concepts to actually update the data, which ties everything together. The delete method I will leave to you; it is fairly straightforward, and there is plenty of documentation on how to delete rows.
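Before moving on to updates, here is roughly what the detail route looks like with the 404 handling in place; the /{event_id} path is assumed from the detail requests made above.

```python
# detail route sketch -- get a single event or raise a 404
from fastapi import HTTPException


@router.get("/{event_id}", response_model=EventModel)
def get_event(event_id: int, session: Session = Depends(get_session)):
    query = select(EventModel).where(EventModel.id == event_id)
    result = session.exec(query).first()
    if not result:
        raise HTTPException(status_code=404, detail="Event not found")
    return result
```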
Deleting is not something I am really concerned about at this point, but updating is a good thing to know because it combines everything we have done so far. Let's look at how to update the event based on the payload that comes through. Keep in mind the payload only carries a certain set of fields; the way we have it, the description is literally the only thing that can change in this update, which is nice because it limits the scope of what could potentially be modified.

The idea is to grab the query and the lookup exactly as we did in the get event, and I want that to happen before I touch the payload data, because I do not need the payload if the ID is wrong. We still need the session here as well, and I will also bring in EventModel as the response_model to keep things consistent, so I will copy that, paste it in, and tidy up the arguments a little so they are easier to review as we go. Now, getting rid of the placeholder, with the session in place I can look the row up as before, and this time I will call it obj instead of result; it is the same thing, still an object, just like when we created data, which means we end up doing that same add, commit, refresh sequence and return the object at the end.

The key question is the data itself: how do we actually update the object based on the incoming payload? The way you do that is for key, value in data.items(), which unpacks the pairs, and then setattr on the object for each key and value. Remember, the incoming payload only has a certain set of keys. If for some reason you did not want to support one of them, you could handle that here, for example if k == "id": continue, because you are never going to want to change that field; you should not really need this at all, it is a bit redundant given the schema itself, but it is a reasonable guard if you are worried about making a mistake.

With that in mind, I can run it and try it out. Looking at the send-data notebook, we have the endpoint for this, basically what we have done before: here is the event data and the description we want to send, and this time I will set it to "inline test" and run it. Now the record shows "inline test", and if I go back and list the data, the description has changed there too, so we really did update that row. If we try to update an item that does not exist, we get a 404: changing the ID to 120 returns "Event not found", and if I print the status code, there is the 404 coming through. So the update endpoint is working. We probably will not use this one that often, but it is still important to know how to update data through an API, not just for what our project is doing.
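A sketch of the finished update route follows; the HTTP method, the EventUpdateSchema name, and the payload.model_dump() call are assumptions about details not spelled out in the narration, but the lookup-then-setattr flow is the one described above.

```python
# update route sketch -- look the row up, then copy the allowed fields onto it
# (EventUpdateSchema is assumed to be imported from .models alongside EventModel)
@router.put("/{event_id}", response_model=EventModel)
def update_event(
    event_id: int,
    payload: EventUpdateSchema,
    session: Session = Depends(get_session),
):
    query = select(EventModel).where(EventModel.id == event_id)
    obj = session.exec(query).first()
    if not obj:
        raise HTTPException(status_code=404, detail="Event not found")
    data = payload.model_dump()   # only the fields the update schema allows
    for k, v in data.items():
        setattr(obj, k, v)        # copy each allowed field onto the row
    session.add(obj)
    session.commit()
    session.refresh(obj)
    return obj
```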
Of course, the challenge I want to leave you with is implementing the delete method yourself, which is actually pretty straightforward given everything here. At this point we have update, we have create, and we also have list, so we are in a really good spot with our API. For the last part of this section, we are going to add a new field: a timestamp on the event model. Having an id is nice because it auto-increments, but having a timestamp gives us much more insight into when something actually happened.

What we are going to do is delete the previous data and then bring things back up; in other words, docker compose down -v, which deletes the database and makes it much easier to change the table. In the long run you would want something like Alembic, which handles database migrations and will quite literally alter your database tables for you, but in our case we are not doing that, because once the model is well organized we should not need to change it very often, and when we do, running that down command is not really a big deal.

Before starting things up again, and this applies to any version of the app, I want to build out the new field. We will call it created_at, and it is a datetime, so we bring in from datetime import datetime, and also timezone. The field is equal to a Field with a few extra arguments. The first is default_factory, which lets me point at a function, so I will define get_utc_now, which returns datetime.now with timezone.utc and then replaces tzinfo with timezone.utc, just to make sure we really are getting a UTC value. That function becomes the default factory, so when a row is created it should call that function and fill the value in correctly.

When it comes to the actual database table, there is one more thing to add: the SQLAlchemy column type, the actual database type. For that we import sqlmodel itself as well.
as DateTime, with timezone set to True. With SQLAlchemy you need to define this explicitly for datetime columns; it's a little different from what we've written so far, but the SQLAlchemy layer does creep into SQLModel once you use more advanced field types, and datetime is one of them. We'll also set nullable=False, which is yet another reason to take the database down and bring it back up. At that point the field should be created for us automatically going forward. Now I'll run `docker compose up` again, which should create the database tables, and I'll also run my local version of the Python app, which also attempts to create them. The only real way to test this right now is to create new data, so back in the notebook: it looks like I got a connection aborted, so I'll do a quick save on one of my models to make sure the server reloads, and now it's up, so running the request again works. Going through the list endpoint I can see the timezone-aware timestamp being created, and I can repeat this as many times as I want; by the fifth record, created_at is clearly in there. One other thing you might consider, using the same idea as created_at, is an updated_at field. It's very similar; you don't strictly need a default_factory on it, though we'll keep one, and the difference is that in your update endpoint you would set updated_at again, assigning the timestamp of when the record was modified. That's slightly outside the scope of what we set out to do here, but the point is to have a set of fields that make a lot of sense for storing data, especially when it comes to datetime fields.
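A sketch of what that field could look like on the model (the helper name get_utc_now matches the walkthrough; the other fields and exact Field arguments are illustrative assumptions):

```python
from datetime import datetime, timezone
from typing import Optional

from sqlmodel import DateTime, Field, SQLModel


def get_utc_now() -> datetime:
    # Always return a timezone-aware UTC timestamp.
    return datetime.now(timezone.utc).replace(tzinfo=timezone.utc)


class EventModel(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    description: str = ""
    created_at: datetime = Field(
        default_factory=get_utc_now,       # called on each insert
        sa_type=DateTime(timezone=True),   # timezone-aware column type
        nullable=False,
    )
```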
We also want a field that is automatically generated inside the database, which is exactly the point of this whole exercise, so let's add that updated_at field. In the event model I'll paste the created_at field and rename it updated_at; there isn't much else to change, but it does mean running `docker compose down` again to drop the tables. Back in the routing, I'll import the get_utc_now function so the update view uses the same logic as creation, and in the update view I'll set `obj.updated_at = get_utc_now()` before saving, so the new timestamp is stored along with the update. The reason updated_at keeps the same default_factory is that when a record is first created we might as well set updated_at too, so both fields start with the same value and only diverge when the record is later updated. Let's make sure everything is running and give it a shot. First I'll send some data with the post request; we get an ID of 1, and fetching that record shows updated_at and created_at are effectively the same time, perhaps differing by microseconds, but the seconds match. Now if I update the record, created_at stays the same and updated_at shifts slightly; it's not a big difference, but it's enough to see that it works. You could also leave updated_at blank instead of setting it automatically, but this is a simple way to make the change visible. The other nice thing is that the list view can now order by updated_at instead of (or as well as) created_at. If I send a couple more events and then update the first one again to shuffle things, the list comes back ordered by updated_at, with the most recently updated record first because that's how we defined it, and I can always flip the ordering to ascending in the routing.
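For reference, the relevant change in the update view might look roughly like this small helper (a sketch assuming the same names as above, not the verbatim source):

```python
from sqlmodel import Session

from .models import EventModel, get_utc_now  # assumed module layout


def apply_update(session: Session, obj: EventModel, data: dict) -> EventModel:
    """Apply payload fields, then stamp updated_at with the same UTC helper."""
    for k, v in data.items():
        setattr(obj, k, v)
    obj.updated_at = get_utc_now()
    session.add(obj)
    session.commit()
    session.refresh(obj)
    return obj
```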
Flipping to ascending, the oldest updated record comes first (the oldest updated one, not the oldest created one) and the most recently updated one comes last. This would have been easy for you to do on your own, but I wanted to show the technique, and most importantly the ordering, and that you can use updated_at and created_at. We now know how to store data in a SQL database thanks to SQLModel. It makes it straightforward to take structured data that is validated through Pydantic and store it in much the same shape it was validated in, and extracting the data looks a lot closer to the actual SQL under the hood, which I think will also help you learn SQL if you're trying to, because the way these queries are designed is very much like SQL itself. SQLModel is a really nice addition to the Python ecosystem, and I'm not alone in thinking that; it's a very popular tool. So we have the foundation to read and write data in our database. Now we need to elevate it so we can read and write a lot more data, and we'll do that by leveraging time series. When it comes to analytics and events, we definitely want time-series data, and the differences are mostly minor thanks to the tooling that's out there, so let's jump into a section dedicated to exactly that. We're building an analytics API so we can see how any of our web applications perform over time; our other applications will send data to this API so we can analyze it, and the key phrase is "over time". A big reason our event model didn't have time fields until later was to highlight that plain Postgres is not optimized for time-series data, so we need to do something different, and that something is TimescaleDB, specifically the hypertables it provides. We've been running a TimescaleDB database this whole time, but we've only been using the Postgres portion of it; now we're going to convert our table into a hypertable, which revolves around time and is optimized for time-series data. A big part of that optimization is automatically removing old data that is no longer relevant; you can build tooling to move that data into cold storage if you want to keep it long term, but the point is to optimize the analytics API around time-series data so we can ingest a lot of it and still query it in a way we're already familiar with, through Postgres. To do this we'll implement TimescaleDB using a package I created, timescale-python. We'll look at it in a moment, but it's essentially our SQLModel setup with a few extra steps to optimize for TimescaleDB, and those extra steps are not a big deal, largely because
the TimescaleDB extension is the one doing the heavy lifting; this is just configuration to make sure it works. Let's get started by converting our SQLModel-based event model into a hypertable, or at least starting that process, so we can leverage the things Timescale does really well. We'll `from timescaledb import TimescaleModel` and swap SQLModel out for TimescaleModel. At this point it's still mostly the same: if you control-click (or right-click) into the definition of TimescaleModel, you can see it is still a SQLModel underneath, with two default fields, id and time, plus some Timescale-specific configuration. The id field looks very familiar to the one we just wrote; it has an extra SQLAlchemy-related argument, and whenever you see `sa_` in SQLModel it refers to SQLAlchemy, a dependency of SQLModel (you can think of it as the older layer underneath). The time field is a datetime with a default_factory of get_utc_now, it's timezone-aware, and it's also a primary key, so the model actually has a composite primary key made up of both fields. Hypertables rely on a time field, which is a big part of why the time field is declared here at all, and the composite key uses both id and time. The idea, then, is to use this time field instead of our own, so back in the event model we can delete our created_at field and our id field. You'll also notice get_utc_now exists in both places; we can use the timescaledb version instead of maintaining our own. You could keep your own, but since we're relying on this third-party package we might as well rely on its helpers too. The last real decision for this model is what we are actually tracking. In my case it's page visits: counting up the number of visits to a page at any given time. That's a little different from, say, sensor data, where you'd have a sensor_id (an integer) and a value (maybe a float), and you'd aggregate something like the average value of a sensor over time. In our case we want to count the number of rows with a specific page on them. I bring this up because designing the model means rethinking which fields are required, and page is absolutely required;
without a page it isn't an event. So page is a string, and this time we'll give it a Field with index=True; an index makes it more efficient to query. The values will be things like the about page or the contact page, and inside Timescale (or Postgres generally) we can aggregate this data, count the pages, and run all sorts of queries, which we'll see very soon. The rest of the old event model we could drop, but I'll keep updated_at (I'll probably never update this data, but it's nice to have) and the description. We're close to having a full hypertable, but it's not quite done yet: we still have to change how we use our session, swap our default engine, and run one more command to initialize the hypertables.
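Putting those decisions together, the hypertable-backed model might look something like this (a sketch: the id and time fields come from TimescaleModel itself, and the helper import path is an assumption):

```python
from datetime import datetime

from sqlmodel import Field
from timescaledb import TimescaleModel
from timescaledb.utils import get_utc_now  # assumed helper location in the package


class EventModel(TimescaleModel, table=True):
    # id and time (the composite primary key) are inherited from TimescaleModel.
    page: str = Field(index=True)   # e.g. "/about", "/contact"
    description: str = ""
    updated_at: datetime = Field(default_factory=get_utc_now)
```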
It's time to create hypertables. If we did nothing else at this point and just used TimescaleModel, the table would not actually be optimized for time-series data; we need one more step. A hypertable is a Postgres table optimized for time-series data, with chunking and an automatic retention policy that drops old data. That's different from a standard Postgres table, but if we don't explicitly convert it, it stays a standard Postgres table, and we want to commit that change. The way we do it is in session.py: import the timescaledb package, and underneath init_db print something like "creating hypertables" and call `timescaledb.metadata.create_all(engine)`. That function looks for all of the TimescaleModel classes and creates hypertables from them if they don't already exist; it's really a conversion process, a Postgres table under the hood that then becomes a hypertable. There are issues that can come up if you migrate these tables often, which is outside the scope of this series, but the point is that this call automatically creates the hypertables (the timescaledb package also has ways to create them manually, especially for migrations, which is a bit more advanced). The big reason I mention migrations at all is that we want the model really solid before we commit, because unlike a standard Postgres table, we probably don't want to change a hypertable very often. After the "creating hypertables" step there is one more change: the create_engine call, because we want a timezone there as well. There's a timezone argument you can pass in, and since we've been using UTC everywhere (it's what get_utc_now returns and the default for TimescaleModel), we'll stick with that; UTC makes it easy to remember that the entire project is in one timezone, and we can always convert later. We might also want to move this into config.py as something like DB_TIMEZONE with a default of UTC, and then import it into session.py and use it there.
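A sketch of what that session setup could look like; the timescaledb create_engine wrapper with a timezone argument is assumed from the description above, and the config names are illustrative, not verified against the source:

```python
# session.py (sketch)
import timescaledb
from sqlmodel import Session, SQLModel

from .config import DATABASE_URL, DB_TIMEZONE  # assumed config values, DB_TIMEZONE defaulting to "UTC"

engine = timescaledb.create_engine(DATABASE_URL, timezone=DB_TIMEZONE)


def init_db() -> None:
    print("creating database tables...")
    SQLModel.metadata.create_all(engine)
    print("creating hypertables...")
    # Converts every TimescaleModel subclass into a hypertable if needed.
    timescaledb.metadata.create_all(engine)


def get_session():
    with Session(engine) as session:
        yield session
```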
We're not quite ready to run this command (by "run this command" I mean take down the old database and bring it back up so every change, including the model changes, is applied), because there are a few configuration items to discuss before we finalize the hypertable. Looking at TimescaleModel again and scrolling down, there are several configuration attributes; the main three I want to look at are the time column, the chunk time interval, and drop_after. The time column defaults to the time field itself; we probably never want to change it, though it is supported if you need to, and in Timescale generally the time column is usually just named time (or is whatever your datetime column is), so I'll leave that alone. The two we will set are the chunk time interval and drop_after, so let's copy those into our event model and set them, as placeholders, to empty strings for now. Here's what's going on: a hypertable needs to store a lot of data efficiently, so it stores it in chunks, and each chunk is effectively its own table. You don't have to manage the chunks yourself; Timescale handles that for us. What we do need to decide is the interval each chunk covers and how long to keep chunks before dropping them, which is drop_after. This is illustrated well in the Timescale docs by comparing a normal table to a hypertable: a hypertable is a table with chunks inside it (chunk 1, 2, 3), and each chunk corresponds to the interval you set, so one chunk might cover January 2nd, the next January 3rd, and so on, because with a one-day time interval each chunk covers a day at a time. The practical point about chunks is that how you handle them depends on how much data you have: with very little data your chunks can be much longer, because no chunk takes up much room and you probably aren't analyzing any single chunk heavily. For example, for your very first website the chunk time interval might be 30 days, written literally as `INTERVAL 30 days`. One downside of a 30-day (or otherwise too-long) interval is that you have to wait for the interval to pass before a change takes effect: if we set 30 days now, all incoming data is chunked at 30 days, and if a day or a week later I want to drop it to one day, we have to wait out the full 30 days before the new interval applies. Automatic changes like that are not currently supported by the TimescaleDB Python package
(or at least not by the one I created; maybe at some point it will be), so the point is to decide this early on. I think an interval of one day is pretty good for what we're doing; if we end up with a lot of data you might drop it to 12 hours, but more than likely one day will be plenty. The next question is when to drop the data. Timescale will automatically drop chunks based on whatever we put in drop_after. Given our interval we could absolutely say drop after 3 days, but that doesn't leave much time for analysis, and "drop after" means quite literally that: after 3 days the chunk is deleted, completely removed from the table. That keeps things optimized and improves performance by removing storage we no longer need, and we can always move old data into cold storage like S3 first. What we want to think about is when we would realistically want to drop old data, and again that's a function of how much we're storing. If I set it to two months, I'd only ever have at most the trailing two months of data; at six months, at most six months, at least in terms of querying this particular model (we can always export chunks and store them elsewhere). In my case I'll leave it at three months, which I think is more than enough; you might use one month, and you can change these later, but I want them in place now so we understand what's going on with our hypertables. So: we're chunking our data by one day, and we're keeping those daily chunks for three months, so we'll always have at least that much data going forward.
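On the model, those two settings might look roughly like this; the dunder attribute names mirror the TimescaleModel configuration described above, but treat the exact spelling and value format as an assumption:

```python
from sqlmodel import Field
from timescaledb import TimescaleModel


class EventModel(TimescaleModel, table=True):
    page: str = Field(index=True)
    description: str = ""

    # Hypertable configuration (assumed attribute names):
    __chunk_time_interval__ = "INTERVAL 1 day"   # each chunk covers one day
    __drop_after__ = "INTERVAL 3 months"         # retention: drop chunks older than 3 months
```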
here and I’m going to go ahead and comment out the process of creating those hyper tabls because we’re going to do it in just a moment then I’m going to go ahead and run Docker compose down just like that and then we’re going to go ahead and bring it back up so that we can quite literally have this working once again okay so we should see creating database regardless of how you end up doing it but the idea here is we now have our database up and running so to verify this what we’re going to be using is something called popsql or popsql decom and just sign up for an account it’s free especially what we’re doing and you’re going to want to download it which is going to be available on this desktop version right here and you can go ahead and download it for it your machine once you do you’re going to open up and you’re going to log in you’re going to probably see something like this so the idea here is you’re going to select your user and click on connections and then you’re going to create a new connection so we’ve got a lot of different connection types in here that we can use to really just run different queries that we might want to in our case right now we’re just going to go ahead and verify that our database is a hypertable database or has a hypertable in it so that means that we’re going to be using time scale here now the idea here is we could also use post credits both of them are basically the same right the connection type is basically the same it’s just a little bit different so we want to use time scale to verify that it is a hypertable when we look at it and then the next part is going to be our connection type we’re going to go ahead and use directly from our computer we want to make sure we do that and of course this is going to be on Local Host and the port here is 5432 the standard postest port and it’s also the same port we defined in composed. 
(the first of the two ports listed there). Then fill out the rest: the database name (our timescaledb database), the username (our timescale user), and the password, then save it and hit connect. Now we should be able to browse the data: refreshing shows our event table, and opening it shows that it is not a hypertable, at least not yet. So back in session.py I'll re-enable the hypertable creation; as soon as I save, I should see "creating hypertables" with no errors, and that should hold regardless of which version of the app is running. Refreshing in PopSQL might show the hypertable, or it might not; if it doesn't, we may just need to start clean, so I'll run `docker compose down` and `docker compose up` again to see if that solves it (it could also be a caching issue). There we go, "creating hypertables" appears, but PopSQL still says no, which it shouldn't, so let's close PopSQL and reopen it; I think it is just caching, which isn't a big deal, we just need to refresh the app and reconnect with the previous connection (we might need to re-enter credentials, though probably not). And now, sure enough, it shows a hypertable right away, with compression currently disabled and the number of chunks listed. So what if we start generating data? Jumping into the notebook: I already had the single post-event request from before, and off camera I added an event-generating cell that randomizes pages and creates 10,000 events. I'll run that, and at some point the table should show one chunk; it may take a moment to be flushed, or we may need to run a query first, I'm not actually sure why it isn't showing immediately, but data is clearly coming in, which is pretty cool. The key thing was verifying that it says hypertable. Compression being disabled goes back to the TimescaleModel default of compression being off; you can absolutely enable compression to make storage even more efficient, but for now we'll leave it as is. We'll use PopSQL a bit more to verify some SQL, but we won't run many raw SQL queries; instead we'll build the equivalent queries right inside our notebooks using SQLModel and the timescaledb package. Ideally we would run this generator repeatedly over a number of days to simulate real visits.
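If you'd rather verify from code than from the PopSQL sidebar, TimescaleDB ships informational views you can query directly; a quick check might look like this (timescaledb_information.hypertables is a standard TimescaleDB view, shown here as a convenience rather than something done in the video):

```python
from sqlalchemy import text

from api.db.session import engine  # assumed import path within the project

# Lists every hypertable along with its chunk count and compression status.
with engine.connect() as conn:
    rows = conn.execute(
        text(
            "SELECT hypertable_name, num_chunks, compression_enabled "
            "FROM timescaledb_information.hypertables"
        )
    )
    for row in rows:
        print(row)
```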
Even so, I can run the generator repeatedly and get another 10,000 events each time, so we end up with plenty of data at any given moment. It runs synchronously, so it isn't quite what a real web application would look like, but it still gives us a lot to play with. Now that we have a hypertable and can send data to it, let's look at querying it with SQLModel. I want to query the table in a way that still leverages the event model itself, and I want to design the queries outside of FastAPI before turning them into API endpoints, just to make sure I'm doing it correctly. We've already seen how to query inside FastAPI; this is about prototyping. We'll do it in a Jupyter notebook: in the nbs folder I'll create "5 - SQLModel queries.ipynb" and import a few things, starting with sys and, from pathlib, Path. When it comes to importing something like models.py from a notebook, there's no really clean way to do it; `from src.api.events.models import EventModel` will most likely fail with "no module named src", partly because we're inside the nbs folder. There are several ways to deal with this, but the one I'll use is `src_path = Path("../src").resolve()`, which points at the src folder, and then append it to the system path with `sys.path.append(str(src_path))`. Once that's on the path I can drop the src prefix and import from api.events.models directly, with no errors. I could also copy all of the model code into the notebook by hand, but importing the real event model lets me do a few other things too, for example importing the engine from api.db.session.
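The notebook bootstrapping described above could look roughly like this (paths and module names follow the project layout described in the video; treat them as assumptions):

```python
import sys
from pathlib import Path

# Make the project's src/ directory importable from inside nbs/.
src_path = Path("../src").resolve()
sys.path.append(str(src_path))

# Now imports work without the src prefix.
from api.events.models import EventModel  # noqa: E402
from api.db.session import engine         # noqa: E402
```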
That gives us the engine, which already has the database URL configured, at least in theory, and running the cell confirms it works. Next I want a very basic lookup for this first part; we'll come back and do more advanced queries after. I'll keep the imports together: from sqlmodel bring in Session, run that cell, then use `with Session(engine) as session:` to do whatever session work we need. This is basically the same as what happens in the app; FastAPI just handles the session dependency for us. So I can grab the same query we used there, verify it here, and print the results (we also need to import select). Running it returns the same sort of data. Before building out the rest of the queries, though, I want to see the raw SQL. The way to do that is `compiled_query = query.compile(...)` with compile kwargs, specifically literal_binds set to True, and here's why: if we print the compiled query and also the plain str() of the query (with a separator line between them), we see two different things. The compiled query is fully rendered, while the other still has a parameter placeholder because it doesn't inline the literal binds; the nice thing about the parameterized form is that you could swap the limit via the parameter, but that's not something we need right now. The point is that I can take the compiled query, open a new tab in PopSQL, paste it in, and run it, and it executes against all of our data: the pricing page, the other pages, everything. (Interestingly, when we imported the pricing data it apparently went in without a leading slash, but the data is all there.) This is how we'll verify that our queries are good as we keep refining the kinds of queries we run; when you're working with database technology, it's a good idea to verify queries like this before putting them anywhere else. With that in place, let's move on to time buckets. Time buckets let us aggregate data over a time interval, which is a really nice way to select the data that falls into an interval; in our case we'll end up counting the right data for the right pages, but we'll start with just the basics of the time bucket itself.
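A minimal sketch of the session usage and the compiled-query trick (the compile kwargs shown are standard SQLAlchemy options; everything else is assumed project naming):

```python
from sqlmodel import Session, select

from api.events.models import EventModel
from api.db.session import engine

query = select(EventModel).limit(10)

with Session(engine) as session:
    results = session.exec(query).all()
    print(results)

# Render the underlying SQL with literal values inlined, so it can be
# pasted straight into a SQL client like PopSQL for verification.
compiled_query = query.compile(compile_kwargs={"literal_binds": True})
print(compiled_query)
print("-----")
print(str(query))  # parameterized form, with bind placeholders
```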
We're now going to look at time buckets. First I'll grab that session block, since we'll keep using it, and work from roughly the same query. A nice thing about these queries is that we can wrap them in parentheses (plain Python) and split them across lines. I don't care about the limit right now, so I'll remove it, and I'm not going to run the query as-is; we're going to bucket it by time. To do that, `from timescaledb.hyperfunctions import time_bucket`, then define `bucket = time_bucket("1 day", EventModel.time)`: some amount of time applied to a specific field, in our case EventModel.time. This is where keeping the time field name consistent pays off: if I switch to a different model, time is still time. Now we can select based on that bucket: in the select I'll include the bucket and also EventModel.page (an arbitrary field at this point). I'll keep the compiled query around because I might look at it in a bit, but for now I'll do `results = session.execute(query).fetchall()` and pretty-print them, so `from pprint import pprint`. Running that, I can see all the data coming through bucketed by time; there's a lot of it, I'm not even sure how much. It's not quite what I want yet: we're no longer ordering by updated_at (it isn't even selected; only the time bucket and the page are), and scrolling down there are duplicates, because every row is there, just with its original timestamp collapsed down to the bucketed date. It's not the same as the earlier results, which we can confirm by re-running the cell above; the datetime objects there are clearly different, so the aggregation is already visibly at work. I'll move the import statements up to the top, and now that we can gate the data by bucket, we want one more thing: counting the items in each time bucket. To do that we import func from SQLAlchemy and, right next to the page in the select, add func.count().
count and that’s it we can give it a label as well if we wanted to we can say something like event count in here and then we could reuse that label later but I’m just going to leave it as simple as possible by using function count if I go ahead and run this we get this other issue now so this is now not necessarily giving us all of the right data so I’m going to go ahead and get rid of this order by here altogether we’ll just use the select for now still not giving us exactly what we want now this is happening in part because we need to group the data somehow so how is this count going to work what is it actually counting so the way we do that then is we come in here and do group by and the way we want to group this is with the time and the page so those two things together will give us the grouping that we’re trying to count so I can go ahead and run this again now it’s showing me the grouping that we’re trying to count this is now looking pretty good I think so it’s giving us all of this data in here and what if we change this to 1 hour we should be able to see something very similar to that maybe we didn’t change very much in an hour let’s change it to one minute now we can see those chunkin coming through uh because of all that data that’s been coming through so if I were to run and send more data in it’s been more than a minute um I should be able to see a bunch of data coming through with these chunks as well we could always take this a little bit further as well but overall what we’ve got is now a Time bucket chunking and we probably also want to maybe order it so we’ll do order bu and in this case I’ll go ahead and say bucket and then we will maybe also do the page itself so the page model is being ordered and then uh or maybe not even in that ordering we can play around with these things a little bit oh we forgot a comma let’s run that again there we go and so now it’s in order right and we see and of course it will skip minutes right so if there’s no data in there it’s not going to have any chunks so it’s going to be skipping things which is what we’re seeing here now there is a way to do something called a gap fill which is definitely a function that the time scale DB package has uh so instead of using time bucket there’s time bucket Gap fill which allows you to fill in empty spots that are missing that might be missing uh by aggregating the data in my case I don’t actually want any empty data coming in here I’m just going to leave it as is because these are the actual data points and the actual counts I’m not going to guess what the counts might end up being now of course the actual analytics for a page would be a little bit different than this if we were actually grabbing web traffic right it’s not going to be consistently skipping minutes based off of the ingested data for all of the data it’ll just be for some of it okay so another thing about this is we can then narrow this down to the pages we want in here so for example if we go back into when we created this data I’m going to scroll up here I’ve got a few pages in here so I can go ahead and grab that and we can grab those pages and put them right in here as to like narrowing down the results that I want to have and the way we do that then is by adding a wear clause in here so we’ll do uh Dot where and inside of here we can then say that this page and then do in with an underscore one of these Pages all right so if we run that now it’s going to go based off of those pages and so if I were to do let’s just say the about 
Filtering to just the about page shows what's happening on that page per time bucket. With a one-day bucket it isn't that dramatic right now, but over time we'd see the days come through, and the aggregation works nicely across all of the available chunks, so we have the ability to do all sorts of advanced querying here. Another option for the where clause is filtering on datetime values directly. Bring in some additional imports, `from datetime import datetime, timedelta, timezone`, then define something like `start = datetime.now(timezone.utc) - timedelta(hours=1)` and a finish built the same way with another timedelta, and add conditions like `EventModel.time > start` and `EventModel.time <= finish`. That filters things down further, although the time bucket is already doing some of this for us: if we bucket by day but also narrow the window, we get different numbers. With the day bucket the about page shows 7,500, but those are only the events from roughly the last hour of today; removing the start/finish filters and running it again gives quite a bit more. So we can filter by datetime if we want to; it's somewhat redundant alongside the time bucket, which already handles intervals effectively, but you will see it from time to time, especially with time_bucket_gapfill, where you might want to fill the gaps of a specific window. One more note: you could do a lot of this in standard Postgres too, with increasingly complex SQL; it just so happens that hypertables, and Timescale specifically, build these functions in so this kind of aggregation is very efficient. Finally, we can take the compiled query (remove the printed results, copy the whole compiled statement from the top, and make sure it ends with page at the end), bring it into PopSQL, and run it; it executes essentially the same query, which is exactly why we keep the compiled query around, so we can test it directly in SQL. In practice you might not run the raw SQL very often, you'll just run it from Python, but the underlying SQL is still there and works essentially the same way SQLModel does, which is pretty cool. It's a very straightforward way to do these aggregations and start analyzing our data.
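For reference, the datetime-window variant described above could look like this (a sketch; the window sizes are just examples):

```python
from datetime import datetime, timedelta, timezone

from sqlalchemy import func
from sqlmodel import select
from timescaledb.hyperfunctions import time_bucket

from api.events.models import EventModel

bucket = time_bucket("1 day", EventModel.time)
start = datetime.now(timezone.utc) - timedelta(hours=1)
finish = datetime.now(timezone.utc) + timedelta(hours=1)

# Same bucketed count, but restricted to roughly a two-hour window around now.
query = (
    select(bucket, EventModel.page, func.count())
    .where(EventModel.time > start, EventModel.time <= finish)
    .group_by(bucket, EventModel.page)
    .order_by(bucket, EventModel.page)
)
```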
All that's left is turning this into an API route that does basically the same thing, so it's time to bring the time-bucket querying and aggregating into FastAPI. The way I'll do this is by replacing our old events list view with the aggregation itself: once we have a lot of data it's unlikely I'll ever list individual items, and I can always bring that back later, but realistically I want aggregations for this lookup. So in the notebook I'll grab everything related to the query (I don't need the compiled query, and I'll use results in a moment), paste it into the route in place of what we had before, and bring in the imports it needs, including the timescaledb import. That gives us the new query, and for the result I'll do fetchall and simply return those rows. That means updating the response model. The response shape is based on what we already saw: a datetime, the page, and some sort of count, but I want to be explicit about it before creating the schema, so inside the select I'll add a label to each column: the time bucket labeled "bucket", the page labeled "page", and the count labeled "count". They're basically what they already were, but adding the label (a SQLModel/SQLAlchemy feature) makes it easy to map onto a response model. Also remember this query always returns a list, so I'll `from typing import List` and use a list of the schema. Back in models.py, let's create the new schema, loosely based on the old event schema: count stays, page is a str, and bucket is a datetime (already imported because of updated_at). We can call it EventBucketSchema, import it into the route, and use a list of it as the response model for this request. Not a huge difference overall. One thing we could consider is moving this query into its own module, maybe something like services.py, but
that’s not something I’m going to do right now because I really just want to work on what we got here now the other things I’m going to get rid of are this start and finish I don’t necessarily need that I’m just going to go based off of the bucket itself um you totally could do the start and finish in there but using just the bucket will allow for me to do these as query parameters which we’ll do very soon but now that we’ve got this API endpoint let’s just make sure that our server is running uh so we’ll go up here and it looks like we need to maybe save models up pi and there’s our oh we got a little little error there let me just resave that there we go so it looks like everything’s running both in python as well as Docker great so now I’m going to go ahead and make an API call which of course we already have that endpoint in there with list data types I can go ahead and run each one of these I might need to restart the server here or restart the notebook itself um but there’s my results right there right and of course these are off of one day so I want to be able to change this I want to be able to come into my request and do pams and say something like duration and we’ll go ahead and change that to 1 minute right I want that to be a URL parameter that I can change and maybe in here I also want to go ahead and say pages and add in something like slab and let’s see another one maybe contact right and so there we go great so those are the parameters I want to bring in now we can do this fairly straightforwardly inside of fast API which uses a query parameter so the query parameter we can bring in as page and this is going to be an st Str or rather not page but we called it duration and this is going to be a string of some kind which basically will be a default of one day okay so that query we need to bring in to from Fast API we’ll go ahead and import query in here and so having a default in there is nice because then I can use that as my parameter here let’s make sure we put a comma great okay so the next one is going to be our Pages this is going to be a list um but the query itself we’ll go ahead and have a default of none or maybe just a default of an empty list let’s just try with none for now um and we’ll leave it as is okay so this is going to be our default Pages now the pages itself I’m going to go ahead and call these lookup pages and I’ll have some defaults already in here we could always say something along the lines of our default lookup pages and set that up here so we can kind of readjust that as we see fit but we’ll have some default lookup pages in here that will basically be this right here and we’ll go ahead and use those inside of our query here for that lookup great so the idea then is inside of here is if there are Pages coming with the query we would say uh Pages basically and we would then want to check if is instance of a list and pages is greater or the length of pages is greater than than zero else we will go ahead and use this so a nice oneliner for the condition basically if all of these conditions are met it’s going to use those pages otherwise it’s going to use the default ones and then that’s now our lookup and now our API endpoint with that time bucket and all of those aggregations let’s just verify that this is working by going back into our notebook and there we go so we’ve got it at 1 minute here’s that one minute we could also verify this by taking a look at the time that is happening obviously if you had a lot more data it’d be even more clear uh 
We can also set the duration to something like one week and it adjusts accordingly, and it only shows the pages we asked for; if I pass an empty list it falls back to the defaults. My defaults are actually missing one page: when I was experimenting I created pricing both with and without a leading slash, so when you do this yourself you'll want sensible default pages. Once I add the other variant, that one shows a lot of data too. This distribution is pretty unlikely in real life; I doubt the about page would be far greater than everything else, or that the contact page would be that high. The data points only look like this because we generated them pseudo-randomly; real data would look quite different. So now we have an actual API endpoint for aggregations over our analytics: we can create data and we can aggregate it, and both of those are what's critical once we put this into a deployed production environment. Next, we want to augment our event model so it's a little closer to real web data: for example, which web browser people use to access these pages, or how long they spend on them, which is more realistic web-analytics data. The point is to show how to add additional fields, what it takes to include them in the time-bucket query, and how they might play out. So we're going to change the event model. Before making changes I'll run `docker compose down -v` so all the data is removed, because I don't have database migrations in place for these changes (you could set migrations up; I haven't). We'll remove some of the default fields (and their comments, since we no longer need them) and add new ones: user_agent (essentially the web browser being used), ip_address (to identify the machine), referrer (whether the visit came from Google or from the site itself), session_id, and duration (how long they spent). You could always capture more than this. What we're not going to do is transform this data on the way in; we'll ingest the raw values and transform them later, which mostly applies to the user agent, so we can extract things like the operating system, as we'll see in a moment. Along with the model, the schemas change too: the create schema no longer has description, just these new fields, and I'm not going to allow updates any longer, so I'll go back into the routing and remove everything related to updating events.
Seeing how to update things was important, but we no longer need it, so that goes too (you can always review the old commits if you want that code back). Data-wise not much else changes, so let's bring Docker back with `docker compose up --watch`, make sure the models are saved, and confirm the database comes up. Maybe these particular fields aren't the ones you'll end up using; that's not really the point. The point is how to add fields to the data we collect and then how to look at them in our events. It also means modifying the data we send, so in the data-generation notebook I'll `pip install Faker`; Faker is a Python package that generates fake data. I'll pull in the setup I prepared: additional pages, 10,000 events to create, Faker-generated fake session IDs (20 in this case, but use 200 if you like), the same API endpoints, and a few referrers. Then we build payloads much like before: loop over the range of events, pick a random page, pick a random user agent (Faker has user agents built in, so just make sure the Faker setup is in place), and assemble the payload from those key/value pairs; it's really just a dictionary, but I like laying it out explicitly to make sure everything's there. For duration, maybe randomize 1 to 300 seconds; there's a lot you could do in that realm. The point is we now have a new payload to send to the API service with all of that data, so I'll run it, and we can see the different items coming through, user agent and all. While that runs, I'll jump over to the API-request notebook and run the list request, and we get the same kind of aggregated data back; at five minutes we see plenty of it. The pages will be a bit different from my original routing defaults, so let's grab the new pages and make them the new defaults in the routing (you don't have to narrow the defaults at all; removing them would let every page show up).
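A sketch of what that fake-event generator could look like (Faker's user_agent, ipv4, and uuid4 providers are real; the endpoint URL, field names, and ranges here are illustrative assumptions):

```python
import random

import requests
from faker import Faker

fake = Faker()

API_ENDPOINT = "http://localhost:8002/api/events/"         # assumed local URL
PAGES = ["/", "/about", "/contact", "/pricing", "/pages"]  # illustrative pages
REFERRERS = ["https://google.com", "https://example.com", ""]
SESSION_IDS = [fake.uuid4() for _ in range(20)]            # 20 fake sessions

for _ in range(10_000):
    payload = {
        "page": random.choice(PAGES),
        "user_agent": fake.user_agent(),     # fake browser string
        "ip_address": fake.ipv4(),           # fake client IP
        "referrer": random.choice(REFERRERS),
        "session_id": random.choice(SESSION_IDS),
        "duration": random.randint(1, 300),  # seconds spent on the page
    }
    requests.post(API_ENDPOINT, json=payload)
```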
With that in mind, let's run the list data request once again, and there we go: now we have a bunch of different pages in there, the contact page and all the rest, and plenty of rows to look through.

Now we want to slice this a little differently. Back in the routing we want to add a new field and count the number of instances per value of that field; most of the query stays the same. Let's do it with the user agent: select the event model's user agent and give it a label like "ua". We can group by that same column and order by it however we see fit. Our results are now shaped slightly differently, so remember that how we return the data matters: the event bucket schema needs a matching "ua" field (UA, not UI), optional, perhaps defaulting to an empty string. Save everything, go back to the list data, and run it again. Now the results are collected by user agent, but many rows show a count of one, because raw user-agent strings include versions and other details. If you want that level of granularity, great, but it's too granular for us.

So let's take it a little further and change how we return the data. Back in the routing, import another SQLAlchemy function called case and build an operating-system case expression. Inside read_events, the idea is: look at the event model's user agent, and if it matches one of a set of patterns, map it to a friendlier value; otherwise mark it as "Other". I can now use that expression in place of the raw user agent, with the label operating_system. Paste it into the select, the group_by, and the order_by, taking the place of the original user-agent column, although it is still derived from the user agent. Since I changed the query, the results change, which means the schema needs updating too: copy the "ua" field and add operating_system, same idea, still optional. The field is named operating_system precisely because that is the label, and all of the other fields line up with their labels for the same reason. Save it, go back to the list view, do a quick search, and now we get a much better picture of the data: Android on the homepage, Android on the about page, how many times, and so on. We can keep going through it and see genuinely robust data.
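Here is a sketch of what that case expression and grouped query could look like. The match patterns, the "1 hour" bucket width, and the way time_bucket is called are assumptions; the course wires the hyperfunction up through its own helper package, whereas this sketch calls it generically through SQLAlchemy's func.

```python
# Build an operating_system label from the raw user agent and group the
# bucketed counts by it. EventModel is the model from the earlier sketch
# (import it from wherever your models live).
from datetime import datetime
from typing import Optional

from sqlalchemy import Interval, case, cast, func
from sqlmodel import SQLModel, select

os_case = case(
    (EventModel.user_agent.ilike("%windows%"), "Windows"),
    (EventModel.user_agent.ilike("%macintosh%"), "MacOS"),
    (EventModel.user_agent.ilike("%iphone%"), "iOS"),
    (EventModel.user_agent.ilike("%android%"), "Android"),
    (EventModel.user_agent.ilike("%linux%"), "Linux"),
    else_="Other",
).label("operating_system")

# TimescaleDB's time_bucket hyperfunction, called generically via func.
bucket = func.time_bucket(cast("1 hour", Interval), EventModel.time).label("bucket")

query = (
    select(bucket, os_case, EventModel.page, func.count().label("count"))
    .group_by(bucket, os_case, EventModel.page)
    .order_by(bucket, os_case, EventModel.page)
)
# results = session.exec(query).all()


class EventBucketSchema(SQLModel):
    bucket: datetime
    operating_system: Optional[str] = ""
    page: str
    count: int
```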
We could take this concept even further. We could use something like the OS case to unpack IP addresses into different parts of the world, and do all sorts of things inside the query. I'm not going to go through those advanced queries here, but it's absolutely something you would be able to do at this point.

Let's look at how we might add the average duration. I'll copy the operating_system field in the schema and add avg_duration as an optional float, defaulting to 0.0 or so. Back in the routing, adding the average duration is just a matter of putting it into the select: func.avg of the event model's duration, with the label avg_duration. And that's it, the data is now enriched with the average duration. Nothing else needs to change, because we are grouping by operating system, page, and time bucket, not by the average duration; grouping by the average wouldn't make any sense here.

Now look at the list again: a quick search shows the average duration coming through on each row. We can also see how it responds to the input data. Back in the send-data notebook, bump the duration range up a lot, say 50 to 5,000 seconds, which is a very different spread of durations and should give us different numbers back depending on the duration parameter we query with. If I bucket by every 15 minutes and run it, the average durations should shift at least somewhat, and sure enough, here's one that jumped a lot because of that change (there was a chance it wouldn't have). It's really nice that we can completely change how our analytics works in a very short amount of time, and this is exactly where spending time building more robust queries would pay off.
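Building on the query sketch above (reusing its bucket, os_case, and select), the change is just one more labeled aggregate in the select; the avg_duration name is again an assumption matching the text.

```python
# Extend the earlier query with an average-duration aggregate.
from sqlalchemy import func

avg_duration = func.avg(EventModel.duration).label("avg_duration")

query = (
    select(bucket, os_case, EventModel.page,
           func.count().label("count"), avg_duration)
    # Grouping stays the same: we aggregate the duration, we don't group by it.
    .group_by(bucket, os_case, EventModel.page)
    .order_by(bucket, os_case, EventModel.page)
)

# ...and the matching schema field on EventBucketSchema:
#     avg_duration: Optional[float] = 0.0
```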
We now have the foundation in place for a very powerful analytics API: we control all of the data coming in, ignore the data we don't want, and analyze it with our very own queries. Could this table design be done in Postgres by itself? Probably, but the real question is how efficient or effective it stays as your dataset grows a lot, and the answer is: not very. The nice thing about TimescaleDB is that you can add it whenever you need to, so if you want to start with just SQLModel, you can. The package we used in this project won't automatically convert an existing table just yet (maybe at some point it will), so with this particular package you more or less have to decide from the get-go. But TimescaleDB itself can be added to a Postgres database at any time, there are a lot of options for doing so, and the fact that it's open source means we can simply activate it in our Postgres database and we're off to the races.

What we want to do now is take this to the next level and actually deploy it, so we can see how it functions in production rather than only on our own systems. We're going to deploy the application using Timescale Cloud and Railway. Timescale Cloud lets us use TimescaleDB without worrying about running it ourselves; it's a managed version, so the performance gains, new releases, and bug fixes are all handled for us. Keep in mind that TimescaleDB is based on Postgres, so it's still just a Postgres database, and we could keep using the open-source version, but it's a lot simpler to use Timescale Cloud directly. If you sign up through the referral link from the course, I get credit for it, which tells them we should make more videos covering these things. Then we'll deploy our containerized application to Railway. We've already seen how to build the containerized application, so that part is fairly straightforward; we'll integrate the two, run it, and test all of it in this section. Let's jump in.

Before going to production I'm going to add CORS middleware. CORS (cross-origin resource sharing) controls which websites, and which HTTP methods, can call our API from the browser. We're going to open the floodgates on it, mostly because we will turn this app into a private networking resource: the other apps that access it will need to be inside the same deployed network, much like you can't access the version running on my computer because it's on a private network you are not part of. So I'll open up my CORS configuration: from fastapi.middleware.cors we import CORSMiddleware, and underneath the app we call app.add_middleware with it. You can check the FastAPI documentation for all of the arguments you might pass; in our case we allow all origins, all methods, and all headers, letting everything come through.
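The resulting setup is the standard FastAPI CORS configuration, roughly:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Wide open for now; this is acceptable here only because the service will
# sit behind Railway's private networking rather than the public internet.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```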
If you were going to expose this to the outside world, though, you would want to lock the origins down to your actual website domain, and perhaps only allow the GET and POST methods, maybe not even DELETE. That's something to think about going forward, but this was the one piece I wanted in place before going into full production.

Now I'm going to push this to GitHub. In my terminal, git status shows all of the files I've changed. I'm not going to walk through the push itself (git remote -v shows I already have a remote configured), because the entire code is available for you to use; in the next part we'll use that code directly and make a few changes so we can do the deployment. I won't do anything else in this project right now other than adding the CORS middleware, and it will be on GitHub in just a moment.

Next we take it from GitHub to deployment. Log into GitHub, or create an account if you don't have one; the account I'm using is really just for these tutorials. Head to the Coding for Entrepreneurs profile, go into the repositories, and look for the analytics API repository (do a quick search if you need to). The point is simply to fork it into your own account: create the fork and it brings the code over. Then we need to add some additional configuration to the forked project, and that configuration is all about deploying FastAPI. The FastAPI container project I put together shows the code that will be used here; it's just boilerplate for deploying FastAPI into Railway. In our case we really only want the railway.json, which is something I've worked out to make sure it works well for you. Copy its contents (select all of it, or press the copy button), go back into the repo you just forked, hit Add file, create a new file called railway.json, and paste it in. Before committing, I'll say it would be a good idea to have this in your local project as well; I'm assuming you haven't been using git this whole time, but if you have, you know what to do from here.

The idea is that we need to adjust our Railway settings to ensure this thing actually gets deployed. The build command in that file is looking for a Dockerfile with no suffix, just plain Dockerfile, so we need to point it at Dockerfile.web,
because that is our actual Dockerfile path, and the same will be true for the watch patterns. We've actually seen watch patterns before: compose.yaml has them, and as it turns out we didn't update our compose.yaml perfectly either (it should probably point at the web Dockerfile as well). The point is that we want the application rebuilt based on these patterns, mostly src and requirements, which are the main ones, though the Dockerfile itself matters too. Then there's the deploy section, which is looking for that health check. Remember when I said we would have a health check? If you look in the main.py code and scroll down, there it is, and notice there is no trailing slash on it, while the railway.json is looking for one. Remove the trailing slash so it matches what's in our code. With that done, I can commit the changes with a message like "create railway.json". Our code is now ready for Railway; there isn't much else we need to do to deploy it.
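For reference, the health-check route that railway.json points at is just a tiny endpoint in main.py. The /healthz path below is a placeholder; use whatever path your code and railway.json agree on (with no trailing slash).

```python
from fastapi import FastAPI

app = FastAPI()

# Railway hits this path after each deploy; a 200 response marks the
# deployment healthy. Keep the path identical to the healthcheck path
# configured in railway.json.
@app.get("/healthz")
def health_check():
    return {"status": "ok"}
```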
Let's go ahead and actually do that. Go to railway.com and feel free to sign up for a free account; notice I'm on the hobby plan, which is a key part of this. Jump into the dashboard, then into your account settings, and make sure your account integrations are connected to GitHub; go through that process if you haven't. It will ask which repositories you want to grant access to: you can say all of them, or select just the one you forked, which is what I'll do (I have two in here, but I definitely want the forked one). Save that, head back into Railway, and deploy: hit New, and notice that the analytics API is there, ready to go. The deployment itself is very straightforward, but the key is to look at the service settings. Scroll down to the Builder and confirm it says it will build with a Dockerfile using BuildKit, and most importantly that it shows Dockerfile.web. If yours does not say that, it most likely means your railway.json was set up incorrectly. Beyond that we probably won't need to change much: the watch paths are in there, we should not have a custom start command (it isn't necessary), and one thing you might change is the region, though I'll stick with this one for now. Scroll down further and you'll see the health check and its timeout; everything is looking pretty good.

Railway then starts deploying automatically. It might succeed, it might not; I actually expect it to fail, mostly because we don't have our database yet. So let's spin up the database and combine the two. Assuming you've already signed up for Timescale Cloud, log in and create your first database service. We're using Postgres, so hit Save and continue, then select a region. The region you choose will usually be close to you physically or close to where most of your users are; if you're building this for somewhere else, put it in that region. US West (Oregon) is closest to me physically, and it's also close to where my app is deployed on Railway: my Railway deployment is currently in California, but Railway has an Oregon region too, so I could always change that. That deployment is going to fail anyway, and I'll leave this set to Oregon since I can match it on the Railway side as well. Hit Save and continue; the amount of data we'll use is practically none since it's still early days, so choose the development tier, Save and continue, then create the service. I'll name it analytics-api and hit Create service. You get a 30-day free trial, and after that it's still very affordable for what you can do with it.

Notice that we get a new connection string. I'm going to copy it and bring it local first: in my .env I'll add the database URL variable and paste it in. The reason is twofold. First, if I ever want to test against my production database locally, I can. Second, if you skipped Docker Compose entirely, this is the route you could take; you would just adjust the connection string slightly, commenting out the old value and using the new one. In my case I'll leave the local value as is and copy the whole new entry with the production string commented out. Then jump back into Railway, open Variables, use the raw editor, and paste in those key/value pairs; that's it. Update the variables, and as soon as you do, Railway will attempt another deploy, which I'll let run.
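As a rough illustration of why a single environment variable is all the app needs, the database setup might look something like this. The DATABASE_URL name and this exact module are assumptions; check the course's db module for the real version.

```python
# Sketch: build the SQLModel engine from an environment variable so the
# same code works locally (Docker Compose) and on Railway + Timescale Cloud.
import os

from sqlmodel import create_engine

DATABASE_URL = os.environ.get("DATABASE_URL", "")
if not DATABASE_URL:
    raise RuntimeError("DATABASE_URL is not set")

# Timescale Cloud connection strings are plain Postgres URLs, so no other
# code has to change between environments.
engine = create_engine(DATABASE_URL)
```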
One other thing that's important to note is how this is being deployed and run. A long time ago in this project we created a Docker run script that takes a port and a host. The port will either be set by us or by Railway, so we can set it explicitly: in the Railway variables, add a new PORT variable and set it to whatever you want. I'll use 8080, which I believe is the default on Railway, and deploy again. One of the nice things about Railway is the ability to change environment variables really quickly and have it rebuild and rerun the containers for us, very similar to what we were doing with Docker Compose. We also have the run host; at some point we might use this a little differently. If you want a private-access deployment, you may need to put the host itself into a variable as well, something along the lines of HOST set to that value. I'll note it in my sample .env and compose files so we have it for later; it matters if you don't want to expose this to the outside world.

After those changes, it was able to deploy the application in a matter of minutes. The deploy logs show that it's running, and the port in there says 8080. We also have the build logs: the container was built quickly, the health check passed quickly, and then it finally deployed. Since it's fully deployed, we of course need to access the API itself. In Settings, go to Networking and generate a domain. Sometimes it asks which port you want to use, sometimes it doesn't; in this case it defaulted to our PORT environment variable, and it looks like we can even edit it right there, which is nice. One thing to check before going further is that the deployment is in the region I wanted, and sure enough it is. Back in Networking, this is our API: the actual URL we can use, and there's the hello world. If I go to the /api/events endpoint itself, we should have no events yet, which is very straightforward. Notice that this is a real production deployment, and it's also using TimescaleDB: back in the Timescale service overview we can see that the service has been created, with all of its service information, and in the Explorer there is a hypertable, and what do you know, there's our event model. All of that is working really well. You could also open up PopSQL and connect the cloud-based database there, which is one of the things it's built for, and you can always test things out locally as we discussed. What we really need to do now, though, is test the actual production endpoint and see if it's working and how well. Let's take a look.
Let's send some test data to our production endpoint. I'll copy the URL I got, which came directly from Railway (you could always add a custom domain, but I'll use the one they gave us), and jump into my local notebook where it says "send data to the API", changing the base URL to that exact value. I don't want any trailing pieces on it, just https plus the endpoint they gave us; always use HTTPS if you can. With that in place I should be able to send a bunch of data. The old payload isn't valid anymore; we want the new data built from the new schemas and models we set up, so here it is, and I'll run it. It's 10,000 events, and in theory it should send all of that data just fine, and it looks like it's working. If I refresh the production endpoint, there's some of that data: it is now working, and working the way you'd hope. This is a production endpoint, fully ready; you could deploy this all over the world if you want, wire it into all of your applications, and go wild with analytics.

But there's still one more thing I want to show you: how to use this privately. We could sit here and wait for all of the data to finish sending (I'll let you play with that), but let's look at how to deploy the analytics API as a private service, so we can still use it this way without exposing it to the outside world. It probably comes as no surprise that you don't want your analytics API open to the public internet, because then you might get events that aren't accurate or real. So we need to turn the analytics API into a private-networking-only service. You can have private networking on a lot of cloud platforms; what we're really doing is removing public networking, which on Railway means simply deleting the public endpoint. After that there's no public networking whatsoever, but there is private networking, which is on by default, and on Railway it uses IPv6, which means we need to update how we access the service. We could spend a lot of time hardening FastAPI itself, adding security layers to allow only certain connections, but turning it into a private networking service adds a whole layer of security on its own, so we don't necessarily need all of that extra machinery inside the application. What we do need is to modify our Docker run command by changing the host. I think I said earlier that it was the run host, but it's really the host value that changes, and it needs to become "[::]", with the two brackets; that's how gunicorn is able to bind to IPv6. By default it binds the IPv4 address you see here instead. So let's grab that value and bring it into our analytics API as an environment variable.
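To make the host/port wiring concrete, here's a minimal sketch of the same idea in Python. The real project drives gunicorn from its Docker run script rather than calling uvicorn like this, and the variable names and defaults are assumptions; the point is simply that PORT comes from Railway and HOST switches to an IPv6 bind for private networking.

```python
# Hypothetical boot sketch: read HOST/PORT from the environment and run the app.
import os

import uvicorn

if __name__ == "__main__":
    # Railway injects PORT. For private networking, set HOST to an IPv6
    # wildcard: "::" here, or "[::]" in gunicorn's bind syntax ("[::]:PORT").
    host = os.environ.get("HOST", "0.0.0.0")
    port = int(os.environ.get("PORT", "8080"))
    uvicorn.run("main:app", host=host, port=port)
```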
We'll add it in just like that, update the variable, and deploy. There's a chance this won't work; the reason has to do with how this string is written, and it's possible it needs string substitution, which we'll see in a little bit. It's also possible it works just fine, and we'll see that shortly too. Now, it's one thing to make the service private, and another to make it private and still be able to access it. The thing that makes Railway a little simpler here is that if we go into the application's settings and scroll down to Private Networking, there is the new API endpoint we can use inside our Railway deployments. The big question is how to access it without building out a whole other application, and I'll show you that in a moment. On the API side, it looks like it's being deployed and we're in good shape: in the deploy logs it's listening at that location, nothing else is erroring out, and we can see our print statements, which is a good sign.

Now we want to deploy something that lets us test these communications. I created a tool, a Jupyter container template, that gives you a Jupyter notebook server right inside Railway, in your own environment. The way we deploy it is by going back into Railway and hitting Create inside our project. The important part is that you are in the exact same project you've been working on (notice I have a couple of deleted services in here); I want it right next to the analytics API, and I do not want to deploy a new project, which can happen if you deploy the template directly from its own page. So from inside the project: Create, go to Template, search for Jupyter (that's with a "py"), and pick the Jupyter container. You could probably use a JupyterLab template as well; the Jupyter container works because I know it works, and that's pretty much it. Next, add a password; I'll just use abc123. This will be publicly accessible, as we'll see in a moment. Deploy it, which might take a moment, and note that it also attaches a volume for persistent storage, which is nice when you actually want to use it.

The next piece is variables. What I can do is add a variable to my Jupyter container that references the analytics API service. Hit New Variable and call it something like ANALYTICS_ENDPOINT; for the value, use a dollar sign and two curly brackets, type out the analytics API service, and select specifically its Railway private domain. That's the value I want available inside this container, and it effectively connects the two services. Add that and deploy. These deployments do take a little time, because Railway is building the container and then deploying the image.
One of the main things to notice is that as soon as I create that variable, a line is drawn between the two services showing that they are connected in Railway terms. That matters because the Jupyter container itself has a public endpoint, as we see right here; the template sets one up without us doing anything else. That's really my goal with this analytics API: make it so easy that we can deploy it pretty much anywhere we need to, as long as the necessary configuration, including our TimescaleDB, is in place across the board. Now we just wait for it to finish so we can log in to the service once it's up and ready, and we'll come back to it.

The Jupyter container finished after a couple of minutes. Jump into its URL, and there it is; I log in with abc123, and if you forget that value you can always look at the Jupyter password variable on the service. This container is meant to be disposable almost as soon as you open it; you can delete it at any time, and that's rather the point. Inside the notebook server, jump into the volume and create a brand-new notebook; call it "connect to analytics" or, as I will, "connect to API". I'll import os. If you're not familiar with Jupyter notebooks, you probably are by now: all of the .ipynb files we've been using are Jupyter notebooks, just running inside our Cursor application. Using os.environ.get, I should be able to print out the environment variable we set for the analytics API endpoint, and there it is. Notice there's no http:// on it, which is an important detail. Next I'll import httpx, which is basically like Python requests but lets us call the IPv6 endpoints. Our base URL will be http:// plus that value, which I'll set up with a little string substitution (not really magic), and there's our base URL.

Now jump back to the "send data to the API" notebook, copy the whole thing, and paste it into the Jupyter notebook. I don't think Faker is installed there, so run pip install Faker, which installs it onto the notebook server, and then we should be able to run the code. The base URL is now the private one, so I can get rid of the two old ones, and I also want to use httpx rather than Python requests; the API is otherwise basically the same. Give it a quick run and we get an error: no service known. That's an issue we'll have to address, but first let's look at the base URL, and what do you know, I never applied the string substitution. Fix that so we have the correct base URL, run it again, and it still might fail.
The reason it might still fail has to do with the port value: we're getting connection refused. When we set up the public endpoint earlier, Railway asked about the port, and that's why I mentioned it in the first place, so we understand that we need to know the port value. You could look it up the same way as with the analytics API endpoint, but I already know off the top of my head that it's 8080, so that's the one we'll use, and we add that port to the URL, the same as if we were doing this locally. Now we should see the full base URL, and I'm hoping this solves the problem (it's certainly possible it still won't). If we look at the deployment for this analytics API, it is indeed running at that specific location; instead of the old public URL we're using the private address, or rather the private DNS name.

Let's see if it starts to work. It looks like we get no response status, since httpx doesn't behave exactly the same way, so let's just print out the returned data instead. Remove that bit, run it, and there's the data coming back: it's quite literally working in a private networking setting, which is pretty awesome if you ask me. To verify the data, I'll stop the loop; we don't need to add 10,000 events, 10 is plenty, and we can also use a GET request to grab the data back. Grab the response the same way (it's still the same endpoint as create), call .get, and we can see what the data looks like; get rid of a few comments, run it, and there you go, the data is coming through as well. If we change the params, say the duration to one day, the data changes accordingly. We can change the pages too: pass pages with, say, just the pricing page, and now only the pricing data comes back, which is not nearly as many rows, because it's bucketing a whole day across the different operating systems, as we saw before. The point is that we now have a private analytics API, properly deployed and leveraging a lot of cool technology.
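Put together, the notebook calls look roughly like the sketch below. The ANALYTICS_ENDPOINT variable name, the 8080 port, the /api/events path, and the payload and query-parameter names are all assumptions drawn from the walkthrough above; adjust them to match your own deployment.

```python
# Hypothetical notebook sketch: talk to the analytics API over Railway's
# private network (IPv6-only), which is why we use httpx instead of requests.
import os

import httpx

endpoint = os.environ.get("ANALYTICS_ENDPOINT")  # Railway private domain, no scheme
base_url = f"http://{endpoint}:8080"
events_url = f"{base_url}/api/events"

# Send a single test event.
payload = {"page": "/pricing", "user_agent": "Mozilla/5.0", "duration": 42}
httpx.post(events_url, json=payload).raise_for_status()

# Read the bucketed aggregation back, narrowed to one page over one day.
resp = httpx.get(events_url, params={"duration": "1 day", "pages": ["/pricing"]})
print(resp.json())
```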
To wrap up this section, we're going to unload everything we just did, meaning delete all of the deployed resources, unless of course you intend to keep them. In Timescale, open the service, choose Delete service, type out "delete", and confirm; that removes the service and all of the data that went with it, which is no big deal since it was all fake data. We also want to jump into Railway and delete the entire project: open the project, go into Settings, then Danger, and remove each item by typing out its name. This is just to make sure you clean up any resources you might not be using in the future. I'll delete both of these services, and then delete the project itself, which I think would also have deleted those services, though this flow has changed a little since the last time I did it. Either way, everything we were just working on gets deleted, so at this point I can no longer access that Jupyter notebook or our deployed production API endpoint. But you now have the skills to deploy it again and again and again, because the repo is of course open source on github.com. Feel free to deploy your own analytics API, and if you change it and make it more robust than what we have here, I'd love to check it out; let me know.

Thanks so much for watching; I hope you got a lot out of this. I think data analytics is only going to get more valuable as it gets harder and harder to compete, so it's one of those areas I plan to spend a lot more time in. What I want to do next is visualize the data we built: I want to build out a dashboard, and I encourage you to attempt the same thing. After I have that dashboard, I really want to see how I can integrate it with an AI agent, to see if I can have conversations about the analytics that are going on. But before chats or a dashboard can really be useful, I probably want to use this somewhere for real, so those things actually help me make decisions. Either way it's going to be an interesting technical problem, we'll probably learn a lot, and I hope to show you some of this in the near future. Thanks to Timescale for partnering with me on this one, and thanks again for watching. I hope to see you again. Take care.

By Amjad Izhar
Contact: amjad.izhar@gmail.com
https://amjadizhar.blog
Affiliate Disclosure: This blog may contain affiliate links, which means I may earn a small commission if you click on the link and make a purchase. This comes at no additional cost to you. I only recommend products or services that I believe will add value to my readers. Your support helps keep this blog running and allows me to continue providing you with quality content. Thank you for your support!
