Cloud-native, data onboarding architecture for the Google Cloud Public Datasets program

Overview

Public Datasets Pipelines

Cloud-native data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program.

Requirements

Environment Setup

We use Pipenv to make environment setup more deterministic and uniform across different machines.

If you haven't done so, install Pipenv by following its installation instructions. Once Pipenv is installed, run the following command:

pipenv install --ignore-pipfile --dev

This uses the Pipfile.lock found in the project root and installs all the development dependencies.

Finally, initialize the Airflow database:

pipenv run airflow initdb

Building Data Pipelines

Configuring, generating, and deploying data pipelines in a programmatic, standardized, and scalable way is the main purpose of this repository.

Follow the steps below to build a data pipeline for your dataset:

1. Create a folder hierarchy for your pipeline

mkdir -p datasets/DATASET/PIPELINE

[example]
datasets/covid19_tracking/national_testing_and_outcomes

where DATASET is the dataset name or category that your pipeline belongs to, and PIPELINE is your pipeline's name.

For examples of pipeline names, see these pipeline folders in the repo.

Use only underscores and alphanumeric characters for the names.
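As a rough sketch of this step in Python, the helper below creates the folder hierarchy and rejects names that contain anything other than underscores and alphanumeric characters. The function name `create_pipeline_dir` and the regular expression are illustrative assumptions and are not part of this repository's scripts.

```python
import re
from pathlib import Path

# Assumed pattern: letters, digits, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def create_pipeline_dir(root: str, dataset: str, pipeline: str) -> Path:
    """Create <root>/datasets/<dataset>/<pipeline>, like `mkdir -p`."""
    for name in (dataset, pipeline):
        if not NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid name: {name!r}")
    path = Path(root) / "datasets" / dataset / pipeline
    # parents=True and exist_ok=True mirror the behavior of `mkdir -p`.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `create_pipeline_dir(".", "covid19_tracking", "national_testing_and_outcomes")` from the project root would produce the same folder as the `mkdir -p` example above.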

2. Write your config (YAML) files

If you created a new dataset directory above, you need to create a datasets/DATASET/dataset.yaml config file. See this section for the dataset.yaml reference.

Create a datasets/DATASET/PIPELINE/pipeline.yaml config file for your pipeline. See this section for the pipeline.yaml reference.

If you'd like to get started faster, you can inspect config files that already exist in the repository and infer the patterns from there.

Every YAML file supports a resources block. To use it, identify which Google Cloud resources need to be provisioned for your pipelines. Some examples are:

  • BigQuery datasets and tables to store final, customer-facing data
  • A GCS bucket to store intermediate, midstream data
  • A GCS bucket to store final, downstream, customer-facing data
  • Sometimes, for very large datasets, a Dataflow job

3. Generate Terraform files and actuate GCP resources

Run the following command from the project root:

$ python scripts/generate_terraform.py \
    --dataset DATASET_DIR_NAME \
    --gcp-project-id GCP_PROJECT_ID \
    --region REGION \
    --bucket-name-prefix UNIQUE_BUCKET_PREFIX \
    [--env dev] \
    [--tf-apply] \
    [--impersonating-acct IMPERSONATING_SERVICE_ACCT]

This generates Terraform files (*.tf) in a _terraform directory inside that dataset. The files contain infrastructure-as-code describing which GCP resources need to be actuated for use by the pipelines. If you pass the --tf-apply parameter, the command also runs terraform apply to actuate those resources.

The --bucket-name-prefix is used to keep the buckets created by different environments and contributors unique, because bucket names must be globally unique across all of GCS. Use hyphenated names (some-prefix-123) instead of snake_case names with underscores (some_prefix_123).
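The prefix convention can be checked before running the command. The sketch below is one possible check and is not something the repository's scripts are known to do; `is_hyphenated_prefix` and its pattern are assumptions that mirror the hyphenated style recommended above, and they are stricter than what GCS itself accepts.

```python
import re

# Assumption: lowercase letters, digits, and hyphens only, starting and
# ending with a letter or digit. This enforces the README's recommendation,
# not the full GCS bucket naming rules.
PREFIX_PATTERN = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_hyphenated_prefix(prefix: str) -> bool:
    """Return True if the prefix follows the hyphenated naming style."""
    return PREFIX_PATTERN.fullmatch(prefix) is not None
```

With this check, `some-prefix-123` passes and `some_prefix_123` is rejected.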

In addition, the command above creates a "dot" directory in the project root. The directory name is the value you pass to the --env parameter of the command. If no --env argument was passed, the value defaults to dev (which generates the .dev folder).

Consider this "dot" directory as your own dedicated space for prototyping. The files and variables created in that directory will use an isolated environment. All such directories are gitignored.

As a concrete example, the unit tests use a temporary .test directory as their environment.

4. Generate DAGs and container images

Run the following command from the project root:

$ python scripts/generate_dag.py \
    --dataset DATASET_DIR \
    --pipeline PIPELINE_DIR \
    [--skip-builds] \
    [--env dev]

This generates a Python file that represents the DAG (directed acyclic graph) for the pipeline (the dot directory also gets a copy). To standardize DAG files, the resulting Python code is based entirely on the contents of the pipeline.yaml config file.

Using KubernetesPodOperator requires having a container image available for use. The command above allows this architecture to build and push it to Google Container Registry on your behalf. Follow the steps below to prepare your container image:

  1. Create an _images folder under your dataset folder if it doesn't exist.

  2. Inside the _images folder, create another folder and name it after what the image is expected to do, e.g. process_shapefiles, read_cdf_metadata.

  3. In that subfolder, create a Dockerfile and any scripts you need to process the data. See the samples/container folder for an example. Use the COPY command in your Dockerfile to include your scripts in the image.

The resulting file tree for a dataset that uses two container images may look like this:

datasets
└── DATASET
    ├── _images
    │   ├── container_a
    │   │   ├── Dockerfile
    │   │   ├── requirements.txt
    │   │   └── script.py
    │   └── container_b
    │       ├── Dockerfile
    │       ├── requirements.txt
    │       └── script.py
    ├── _terraform/
    ├── PIPELINE_A
    ├── PIPELINE_B
    ├── ...
    └── dataset.yaml

Docker images will be built and pushed to GCR by default whenever the command above is run. To skip building and pushing images, use the optional --skip-builds flag.
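To check that your image folders follow the layout above before running the command, a small script can list the subfolders of _images that contain a Dockerfile. This is only a sketch under that assumption: `find_image_dirs` is a hypothetical helper and does not describe how scripts/generate_dag.py actually discovers images.

```python
from pathlib import Path
from typing import List


def find_image_dirs(dataset_dir: str) -> List[str]:
    """Return names of subfolders of <dataset_dir>/_images with a Dockerfile."""
    images_root = Path(dataset_dir) / "_images"
    if not images_root.is_dir():
        # No _images folder means there are no custom images to build.
        return []
    return sorted(
        child.name
        for child in images_root.iterdir()
        if child.is_dir() and (child / "Dockerfile").is_file()
    )
```

For the file tree shown above, `find_image_dirs("datasets/DATASET")` would return the two container folder names.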

5. Declare and set your pipeline variables

Running the command in the previous step will parse your pipeline config and inform you about the templated variables that need to be set for your pipeline to run.

All variables used by a dataset must have their values set in

  [.dev|.test]/datasets/{DATASET}/{DATASET}_variables.json

Airflow variables use JSON dot notation to access the variable's value. For example, if you're using the following variables in your pipeline config:

  • {{ var.json.shared.composer_bucket }}
  • {{ var.json.parent.nested }}
  • {{ var.json.parent.another_nested }}

then your variables JSON file should look like this:

{
  "shared": {
    "composer_bucket": "us-east4-test-pipelines-abcde1234-bucket"
  },
  "parent": {
    "nested": "some value",
    "another_nested": "another value"
  }
}
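The relationship between the templated variables and the JSON file can be illustrated in plain Python. The sketch below builds the expected file path for a given environment and dataset, and walks a dotted key through the nested JSON. Both `variables_path` and `resolve` are hypothetical helpers written for illustration; they imitate the lookup and are not Airflow's implementation.

```python
import json
from pathlib import Path
from typing import Any


def variables_path(project_root: str, env: str, dataset: str) -> Path:
    """Build <root>/.<env>/datasets/<dataset>/<dataset>_variables.json."""
    return (
        Path(project_root) / f".{env}" / "datasets" / dataset
        / f"{dataset}_variables.json"
    )


def resolve(variables: dict, dotted_key: str) -> Any:
    """Walk nested dicts: 'parent.nested' -> variables['parent']['nested']."""
    value: Any = variables
    for part in dotted_key.split("."):
        value = value[part]
    return value


# The same content as the variables file shown above.
variables = json.loads("""
{
  "shared": {"composer_bucket": "us-east4-test-pipelines-abcde1234-bucket"},
  "parent": {"nested": "some value", "another_nested": "another value"}
}
""")

# Corresponds to {{ var.json.parent.nested }} in the pipeline config.
print(resolve(variables, "parent.nested"))
```

A missing key raises `KeyError`, which is a quick way to spot a variable that is referenced in the pipeline config but not yet set in the JSON file.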

6. Deploy the DAGs and variables

Deploy the DAG and the variables to your own Cloud Composer environment using the following command:

$ python scripts/deploy_dag.py \
  --dataset DATASET \
  --composer-env CLOUD_COMPOSER_ENVIRONMENT_NAME \
  --composer-bucket CLOUD_COMPOSER_BUCKET \
  --composer-region CLOUD_COMPOSER_REGION \
  --env ENV

Testing

Run the unit tests from the project root as follows:

$ pipenv run python -m pytest -v

YAML Config Reference

Every dataset and pipeline folder must contain a dataset.yaml and a pipeline.yaml configuration file, respectively.

Best Practices

  • When running scripts/generate_terraform.py, the argument --bucket-name-prefix helps prevent GCS bucket name collisions because bucket names must be globally unique. Use hyphens over underscores for the prefix and make it as unique as possible, and specific to your own environment or use case.

  • When naming BigQuery columns, always use snake_case and lowercase.

  • When specifying BigQuery schemas, be explicit and always include name, type, and mode for every column. Derive column descriptions from the data source's definitions when available.

  • When provisioning resources for pipelines, a good rule-of-thumb is one bucket per dataset, where intermediate data used by various pipelines (under that dataset) are stored in distinct paths under the same bucket. For example:

    gs://covid19-tracking-project-intermediate
        /dev
            /preprocessed_tests_and_outcomes
            /preprocessed_vaccinations
        /staging
            /national_tests_and_outcomes
            /state_tests_and_outcomes
            /state_vaccinations
        /prod
            /national_tests_and_outcomes
            /state_tests_and_outcomes
            /state_vaccinations
    
    

    The "one bucket per dataset" rule prevents us from creating too many buckets for too many purposes. This also helps in discoverability and organization as we scale to thousands of datasets and pipelines.

    Quick note: If the data fits conveniently in memory and the transforms are close to trivial and computationally cheap, you may skip storing midstream data. Apply the transformations in one go and store the resulting data in its final destination.
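The BigQuery naming and schema practices above can be checked mechanically. The following sketch assumes each column is represented as a dictionary with name, type, and mode keys; `schema_problems` and the snake_case pattern are illustrative assumptions and not part of the repository.

```python
import re
from typing import Dict, List

REQUIRED_KEYS = ("name", "type", "mode")
# Assumed definition of lowercase snake_case: a lowercase letter first,
# then lowercase letters or digits, with single underscores as separators.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def schema_problems(columns: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems found in a column list."""
    problems = []
    for index, column in enumerate(columns):
        # Every column should state its name, type, and mode explicitly.
        missing = [key for key in REQUIRED_KEYS if not column.get(key)]
        if missing:
            problems.append(f"column {index}: missing {', '.join(missing)}")
        name = column.get("name", "")
        if name and not SNAKE_CASE.fullmatch(name):
            problems.append(
                f"column {index}: name {name!r} is not lowercase snake_case"
            )
    return problems
```

An empty result means every column states its name, type, and mode and uses a lowercase snake_case name.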
