Titanic dataset github

This project explores the infamous Titanic dataset to uncover insights into the tragic sinking of the Titanic and predict survival outcomes of its passengers. 31 %. Exploring Kaggle's Titanic dataset. The model predicts ... From a very basic analysis we can see that according to our subset: 38. This repo contains Titanic datasets in different formats. the mean or median). push() will list any datasets that are in your data/raw or data/processed folder and ask whether you want to upload them all or select individual sets for upload. Implemented Logistic Regression on the Titanic dataset to classify whether or not a person survived the sinking of the Titanic. txt: Lists necessary Python packages for running the code. This dataset has passenger information for those who boarded the Titanic, along with other information like survival status, class, fare, and other variables. Based on: Microsoft Azure (data science masterclass), Kaggle, Pandas, scikit-learn and Seaborn. This notebook is a simple example where I incorporate both historical and fictionalized aspects from the 1997 epic romance and disaster movie directed by James Cameron, starring Leonardo DiCaprio and Kate ... Obtained an accuracy of 80. 4. Variables present in the dataset include age, sex, fare, ticket, etc. This project analyzes the Titanic dataset to explore factors influencing passenger survival rates during the tragic sinking of the RMS Titanic in 1912. A public repo of datasets. Datasets are not shared automatically, but you can share them to your project's Swift container with a simple command. The sinking of the RMS Titanic is one of the most infamous shipwrecks in history.
csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like. 2. An exploratory data analysis (EDA) project on the Titanic dataset to uncover insights into passenger survival rates. titanic_dataset: Leveraging the Titanic dataset, learn the ins and outs of transitioning from research to production. This dataset comes from the Titanic Kaggle competition. Testing different ML models on the famous Titanic dataset from ... This project uses machine learning techniques to predict the survival of passengers on the Titanic. The dataset is split 80% for training and 20% for testing or validation. More people in Pclass 1 survived than died (the first peak of red is higher ... Titanic: Requirements: python3, libraries, modules. illi4/Titanic_dataset: An analysis of the Titanic dataset from Kaggle using Python, pandas and matplotlib. plot(kind='bar',figsize=(5,3)) """The graph above shows the number of males and females who survived, by port of embarkation (Embarked). Using the given dataset, we perform data preprocessing, exploratory data analysis (EDA), and train machine learning models like Logistic Regression and Random Forest. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Initial data exploration and cleansing procedures were employed to remove irrelevant columns, address missing values, and rectify data types for accurate analysis. Data science, Power BI and Tableau sample datasets. A Python script to classify the Titanic dataset (survived or not survived) by applying different machine learning algorithms such as Logistic Regression, SVM, KNN, Decision Tree and Random Forest.
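The 80/20 split and the model comparison mentioned in these snippets can be sketched roughly as follows. This is a minimal illustration rather than any repository's actual code: the data are synthetic (generated with NumPy, not the real passenger records), the 0/1 encoding of `Sex`, the label-generating formula and all hyperparameters are assumptions, and scikit-learn is assumed to be installed.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Titanic features (NOT real passenger data).
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "Pclass": rng.integers(1, 4, size=n),   # 1, 2 or 3
    "Sex": rng.integers(0, 2, size=n),      # assumed encoding: 0 = male, 1 = female
    "Age": rng.uniform(1, 70, size=n),
    "Fare": rng.uniform(5, 100, size=n),
})

# Made-up rule for the label: women and higher classes survive more often.
logit = 2.0 * X["Sex"] - 0.8 * (X["Pclass"] - 2) - 0.5
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# 80% training, 20% test; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training part and score it on the held-out part.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

With the real `train.csv`, categorical columns such as `Sex` and `Embarked` would need encoding and missing `Age` values would need handling before fitting; the accuracies printed here say nothing about results on the real data.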
com (requires opening an account with Kaggle). The Titanic dataset is also the subject of the introductory competition on Kaggle. Pclass: Passenger class (1 = 1st, 2 = 2nd, 3 = 3rd). The number of rows and columns in the dataset is printed. Passengers in first class had a significantly higher survival rate compared to those in lower classes, highlighting the disparities in access to resources and ... The Titanic dataset (cleaned) from Kaggle is chosen. It includes scripts for data preprocessing, model training, and evaluation, aiming to predict survival based on factors like age, gender, and ticket class (IsAnshika/Titanic). This sensational tragedy shocked the international community and led to better safety regulations for ships. Jul 13, 2024 · This repository explores the Titanic dataset, analyzing passenger demographics and survival outcomes through machine learning models and data visualization techniques. I have worked with the famous Titanic dataset from Kaggle which contains two different files, train. csv` dataset contains similar information but does not disclose the “ground truth” for each passenger. Once all the database objects have been set up, download the import files from this repository into a location. This repo contains ML models built for the famous Titanic dataset. Identify Categorical and Numerical Columns: Classify columns into categorical and numerical types to facilitate analysis. Titanic Dataset PPT. This dataset contains information about passengers aboard the Titanic, including survival status. However, in the end we found that the kernel SVM has the best accuracy, which is about 83. geodra/Titanic-Dataset: A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Survived: Survival status (0 = No, 1 = Yes). Using the patterns you find in the train. csv).
The analysis aims to explore various aspects of the dataset to gain insights into passenger demographics, family relationships, fare distribution, and survival rates. This repo contains ML models built for the famous Titanic dataset, where you are meant to predict survived vs. not survived - Titanic-Dataset/README. Attributes with continuous numerical values are converted into categorical variables. This repository contains an in-depth analysis of the Titanic dataset, a popular dataset used for exploring predictive modeling and statistical analysis techniques. This data set consists of information about the passengers of the RMS Titanic, including whether each passenger survived the disaster. The project aims to explore various factors that affected the survival rates of passengers aboard the Titanic and to build a predictive model to determine the likelihood of survival. csv. ipynb: A notebook containing a machine learning analysis of the Titanic dataset. pptx: A PowerPoint presentation that details the analysis of the Titanic dataset. Exploring the Titanic Dataset with R: scripts to explore the Titanic dataset with R. Master modularization, code standards, and scalability for successful machine learning deployments. In Part 2, taking care of the missing values. Leveraged Pandas, Matplotlib, Seaborn, and SQLite to uncover insights into passenger demographics and survival rates. Using Python and its powerful data analysis libraries such as Pandas, Matplotlib, and Seaborn, the analysis focuses on answering specific questions about survival rates and their relationship to ... Titanic full dataset. Oct 23, 2024 · survive_sex=pd.
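The `survive_sex=pd.` fragment above appears to be the start of a cross-tabulation whose other pieces (`crosstab(titanic.`, `Sex,titanic.`, `Embarked)` and `plot(kind='bar',figsize=(5,3))`) are scattered across this page. A minimal sketch of what the reassembled call might look like is given here; the reassembly is an assumption, and the small `titanic` frame is made-up toy data rather than the real file.

```python
import pandas as pd

# Toy stand-in for the real DataFrame (made-up rows, for illustration only).
titanic = pd.DataFrame({
    "Sex": ["female", "male", "female", "male", "male", "female"],
    "Embarked": ["S", "S", "C", "Q", "S", "C"],
})

# Count passengers for every (Sex, Embarked) combination.
# Note: as written this counts all passengers, not only survivors,
# despite the variable name used in the original fragment.
survive_sex = pd.crosstab(titanic.Sex, titanic.Embarked)
print(survive_sex)
```

With matplotlib installed, `ax = survive_sex.plot(kind='bar', figsize=(5, 3))` would draw the table as a grouped bar chart. To count survivors only, the frame could first be filtered, for example `titanic[titanic.Survived == 1]`, assuming a `Survived` column is present.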
The Titanic Survival Prediction project uses machine learning to predict passengers' survival chances from the Titanic disaster. This project aims to predict the survival of passengers aboard the Titanic using the Naive Bayes classifier algorithm. Problem Statement: The goal is to analyze the Titanic dataset to: Explore ... Oct 6, 2024 · Read the Dataset: Load the Titanic dataset from a CSV file for analysis. The study examines the Titanic ... Aug 29, 2014 · Dataset(titanic. We will implement three different machine learning algorithms: Logistic Regression, Random Forest Classifier, and Gradient Boosting Classifier to make predictions. Building a machine learning model for the famous Titanic dataset using several different classifiers. On 15 April 1912, the Titanic sank after colliding with an iceberg with 2224 people aboard. - getyrno/kaggle-titanic. Titanic - Machine Learning from Disaster Kaggle Competition (NyistMilan/Titanic-Dataset). This project analyzes the Titanic dataset to uncover insights into the factors that influenced passenger survival. Titanic DataSet. Sex: Gender of the passenger. csv) survived. The goal of this visualization is to provide ... The code begins by loading the Titanic dataset from a CSV file named "titanic_ds. Jul 31, 2024 · titanic_dataset. 6. notebooks/: Contains Jupyter Notebooks for analysis and model training. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates. txt), added in the repository. Titanic Dataset - Train. Posed several questions about the Titanic dataset, then ... Titanic Dataset Analysis and Visualization: This repository contains a comprehensive analysis of the Titanic dataset using Python. 05 % and a standard deviation of 3.
md at main · gakhil117/Titanic-Dataset. This project focuses on visualizing the Titanic dataset using Tableau to analyze various aspects of the passengers aboard the Titanic. Objective: The objective of this project is to perform exploratory data analysis to understand the characteristics of the Titanic dataset, find correlations between variables, and identify factors influencing survival rates. In the first part, understanding the Titanic dataset and performing exploratory data analysis. Data Inspection: Examine the first few rows of the dataset and check the data types of each column to understand its structure. Calling Dataset. Import the data using the following statements ... Uncover the secrets of deploying ML models in production with this tutorial. PassengerId: Unique ID for each passenger. We used the famous Titanic dataset from Kaggle, which includes information such as passenger class, age, sex, fare, and other features to predict whether a passenger survived or not. The files are from Kaggle; however, the headers have been removed because the 'bulk insert' process in SQL Server was having issues with them. The dataset includes information about the passengers, such as their age, sex, embarkation port, passenger class, ticket fare, and whether they survived the disaster. Mean / Median / Mode imputation: These are the simplest imputation methods, and, at most, involve calculating some simple column statistics. csv. The dataset consists of information about the people who boarded the famous RMS Titanic. Description: Dataset describing the survival status of individual passengers on the Titanic.
38% or just over one third of our passengers survived; the average age of the passengers was almost 30 years and the median was 28 years. This repository is dedicated to performing a thorough exploratory data analysis of the Titanic passenger dataset. Age: Age of the passenger. The dataset encompasses information on 891 Titanic passengers, including details such as passenger ID, survival status, class, age, gender, fare, cabin, and embarkation port. - vishnudev-p. The goal of this project is to predict whether a passenger survived the Titanic disaster (binary classification). One of the key insights from the Titanic data set is the strong relationship between a passenger's socioeconomic status, as measured by their ticket class, and their chances of survival. titanic_dataset. Key features include age, gender, class, and fare. 7. The project focuses on understanding the factors that influenced the survival rates of passengers aboard the Titanic. The objective is to explore relationships between variables and identify patterns and trends in the data. crosstab(titanic. csv will contain the details of a subset of the passengers on board (891 to be exact) and importantly, will reveal whether they survived or not, also known as the “ground truth”. g. An analysis of the Titanic dataset from Kaggle using Python, pandas and matplotlib. KNN and SVM algorithms with Stratified K-Fold Cross-Validation to predict passenger survival in the Titanic dataset.
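The summary figures quoted above (overall survival share, and the link between ticket class and survival) are the kind of numbers a short group-by produces. A minimal sketch follows, using a made-up ten-row frame so it runs without the real file; the rates it prints are properties of the toy rows only and do not reproduce the 38% figure.

```python
import pandas as pd

# Made-up toy rows using the dataset's column names (not real passengers).
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3, 3, 3],
    "Sex":      ["female", "male", "female", "female", "male",
                 "male", "male", "female", "male", "male"],
    "Survived": [1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
})

# Survived is coded 0/1, so its mean is the share of survivors.
overall_rate = df["Survived"].mean()

# Survival share within each ticket class and within each sex.
rate_by_class = df.groupby("Pclass")["Survived"].mean()
rate_by_sex = df.groupby("Sex")["Survived"].mean()

print(f"overall: {overall_rate:.2f}")
print(rate_by_class)
print(rate_by_sex)
```

On the real training file the same three lines would apply unchanged after `df = pd.read_csv("train.csv")`, where the file name is assumed from the Kaggle competition.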
Titanic dataset, data preprocessing, machine learning. The dataset used in this project is the Titanic dataset, containing the following columns: Survived: Survival (0 = No, 1 = Yes); Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd); Sex: Gender; Age: Age in years; SibSp: Number of siblings/spouses aboard the Titanic; Parch: Number of parents/children aboard the Titanic; Fare: Passenger fare. Jun 26, 2024 · The dataset used in this project is sourced from Kaggle and is available in the repository. Additional files such as data sets, scripts, etc. csv that contains the details of a subset of the passengers on board and importantly, will reveal whether they survived or not, and test. This repository contains the analysis and visualization of the Titanic dataset. Sex,titanic. The analysis involves data preprocessing, visualization, and statistical evaluation using Python libraries like Pandas, Matplotlib, and Seaborn. pbix: A PowerBI dashboard visualizing the Titanic dataset. The dataset used in this project contains information about Titanic passengers, such as their age, gender, passenger class, and other relevant features. titanic_dataset_powerbi_dashboard. 5. The titanic. It employs classification algorithms like Logistic Regression, SVM, Decision Tree, Random Forest, and KNN, trained on the Titanic dataset. It includes the following columns: 1. SibSp: Number of siblings ... Analyzed the Titanic dataset using Python for data cleaning, exploratory data analysis, and visualization (f-a-tonmoy/Titanic-EDA). Oct 8, 2023 · The Titanic dataset is a classic playground for data scientists. data/: Includes the Titanic dataset files (train. The dataset comprises 891 observations of 12 columns. The first 5 rows of the dataset are displayed to give an initial overview.
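The inspection steps described in these snippets (print the shape, show the first rows, check data types and missing values, separate categorical from numerical columns) can be sketched as below. To keep the example self-contained it reads a four-row in-memory CSV whose rows are made up; only the twelve column names follow the Kaggle training file.

```python
import io

import pandas as pd

# Made-up rows with the 12 column names of the Kaggle training file.
CSV_TEXT = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,S
4,0,2,Passenger D,male,35,0,0,T4,13.00,,Q
"""

# With the real file this would be something like pd.read_csv("train.csv").
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Number of rows and columns.
n_rows, n_cols = df.shape
print(f"rows={n_rows}, columns={n_cols}")

# First rows and the data type of each column.
print(df.head())
print(df.dtypes)

# Missing values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Split column names into numerical and non-numerical (categorical/text).
numerical_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
print("numerical:", numerical_cols)
print("categorical:", categorical_cols)
```

Note that this simple split treats `Pclass` and `Survived` as numerical because they are stored as integers, even though they are categories in meaning.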
This task falls under the umbrella of classification in data science, where we aim to assign each passenger to one of two classes: survived or did not survive. The sinking of the Titanic is one of the most infamous shipwrecks in history. An in-depth analysis of the Titanic dataset, exploring passenger demographics, survival rates, and other key metrics using Python. It’s your job to predict these outcomes. 97% on test data set. By exploring relationships between variables such as age, gender, passenger class, and fare, I aim to understand how these factors impacted survival rates. Logistic Regression | Titanic Dataset. This project provides an opportunity to explore key machine learning concepts and gain insights from historical data. This repository contains code for data acquisition, preprocessing, visualization, and a detailed exploration of patterns and insights related to the tragic sinking of the RMS Titanic. The main objective of this project is to build a Machine Learning pipeline that processes the Titanic dataset, engineers relevant features, and predicts which passengers are most likely to survive. These resources are meant to provide a comprehensive and interactive way to analyze and understand the ... The following recommendation is made to prevent future loss of life: because a "women and children first" rule, with men last, may have contributed to the great loss of life, I recommend that evacuation priority instead be based on factors such as medical needs, the ability to rescue oneself or others, and physical fitness. It involves predicting passenger survival on the ill-fated ship based on various features. - niharikakuchhal. titanic is an R package containing data sets providing information on the fate of passengers on the fatal maiden voyage of the ocean liner "Titanic", with variables such as economic status (class), sex, age and survival.
Includes the definition of questions to be answered, detailed description of the exploratory steps, and communication of conclusions. Unfortunately, there weren’t enough lifeboats for everyone on board, resulting in the ... A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Welcome to the Titanic Dataset Dashboard - PowerBI repository! Here, you will find files related to the famous Titanic dataset, including a machine learning notebook and a PowerBI dashboard. We also include gender_submission. A walk-through of data science basics using PySpark, MLflow and the Titanic dataset (bensadeghi/Databricks-DataScience-Titanic). This repository contains an analysis of the Titanic dataset, which provides information about passengers aboard the RMS Titanic during its tragic maiden voyage. csv file contains data for 887 of the real Titanic passengers. The project involves data cleaning, exploration, visualization, and statistical analysis to gain insights into survival rates, demographic patterns, and relationships between various features of the passengers. Using Python and various data science libraries, the analysis encompasses data cleaning, exploratory data analysis (EDA), feature engineering, and predictive modeling. Using a Naïve Bayes classifier, we try to predict which passengers survived the Titanic shipwreck. Name: Name of the passenger. The goal is relatively simple: Build a predictive model that is able to predict which passengers survived the Titanic disaster based on their passenger data. Missing values in the original dataset are represented using ?.
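When a file marks missing values with `?`, as the last sentence describes, pandas can be told to treat that token as missing at load time through the `na_values` parameter of `read_csv`. The snippet below is a small sketch under that assumption; the three rows and the lowercase column names are made up for illustration.

```python
import io

import pandas as pd

# Made-up sample in which "?" stands for a missing age.
RAW = """pclass,survived,sex,age,fare
1,1,female,29,211.34
3,0,male,?,7.25
2,1,female,?,13.00
"""

# na_values="?" makes pandas parse every "?" cell as NaN, so the age
# column is read as numeric instead of as text.
df = pd.read_csv(io.StringIO(RAW), na_values="?")

missing_age = int(df["age"].isna().sum())
print("missing ages:", missing_age)
print("age dtype:", df["age"].dtype)
```

Without `na_values`, the `age` column would be loaded as strings and numeric operations such as `mean()` would not work as expected.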
Missing values in each column are checked and displayed. The number of females who embarked from Cherbourg is 50-100, the number who embarked from Queenstown is less than 50, and the number who embarked from Southampton is 200. It contains all the facts, history, and data surrounding the Titanic, including a full list of passengers and crew members. The `test. The analysis focuses on identifying the factors that influenced the survival rates of passengers, utilizing Python and its powerful libraries (RichieRk/Titanic_Dataset). Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. csv data, predict whether the other 418 passengers on board (found in test. The project encapsulates the complete ... Solution to Kaggle's Titanic dataset using various ML algorithms. The goal is to predict the survival or death of a given passenger based on 12 features such as sex, age, etc. The dataset includes information about passengers' demographics and whether they survived or not. For each column, and potentially within each group in a MAR (missing at random) scenario, you replace missing values with either a constant value or some column statistic on the available feature values (e. This is a binary classification task to predict the survival or death of a passenger onboard the Titanic. Libraries used for Data Visualization: Matplotlib, Seaborn. Analyze the Titanic dataset and fit a logistic regression model to predict passenger survival. It includes steps such as data cleaning, exploratory data analysis, visualizations, and model building to predict survival outcomes based on various factors. titanic_model_predictor. Analysis of Titanic full Dataset. csv, test.
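The imputation idea described above (replace missing values with a column statistic, possibly computed within groups) can be sketched as follows. The six-row frame is made up, and grouping by `Pclass` is only one possible choice of group, assumed here for illustration.

```python
import pandas as pd

# Made-up toy frame with gaps in a numerical and a categorical column.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Age": [40.0, None, 50.0, 20.0, 24.0, None],
    "Embarked": ["S", "C", None, "S", "S", "Q"],
})

# Global median imputation: one statistic for the whole column.
age_median = df["Age"].median()
df["Age_global"] = df["Age"].fillna(age_median)

# Group-wise median imputation: each missing age is replaced by the
# median age of passengers in the same class.
class_median = df.groupby("Pclass")["Age"].transform("median")
df["Age_by_class"] = df["Age"].fillna(class_median)

# Mode imputation for a categorical column: use the most frequent value.
embarked_mode = df["Embarked"].mode()[0]
df["Embarked_filled"] = df["Embarked"].fillna(embarked_mode)

print(df)
```

In a modeling workflow the statistics would normally be computed on the training split only and then applied to the test split, to avoid leaking information from the test data.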
This project aims ... For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic. In this project, K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) algorithms are used to predict passenger survival in the Titanic dataset. requirements. Titanic passenger data analysis consists of: data exploration and preparation, data representation and transformation, data visualization and presentation - titanic-dataset/README. 3. csv that contains similar information but does not disclose Survived information for each passenger. Embarked) ax=survive_sex. 1. The goal is to use machine learning to predict whether a passenger will live or die given their data: for example, is the passenger travelling in third class? This project involves performing data cleaning and exploratory data analysis (EDA) on the Titanic dataset from Kaggle. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. md at master · rajrohan/titanic-dataset. Libraries used for Preprocessing: Pyforest, NumPy, Pandas. " Data Exploration.
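The "all and only female passengers survive" baseline mentioned on this page, together with the two-column submission layout (`PassengerId`, `Survived`), can be sketched as below. The four test rows are made up, and the rule is a deliberately naive baseline rather than a trained model.

```python
import pandas as pd

# Made-up stand-in for test.csv: only the columns this baseline needs.
test_df = pd.DataFrame({
    "PassengerId": [892, 893, 894, 895],
    "Sex": ["male", "female", "male", "female"],
})

# Baseline rule: predict 1 (survived) for every woman, 0 for every man.
submission = pd.DataFrame({
    "PassengerId": test_df["PassengerId"],
    "Survived": (test_df["Sex"] == "female").astype(int),
})

# to_csv() without a path returns the CSV text; pass a file name such as
# "submission.csv" (hypothetical) to write it to disk instead.
csv_text = submission.to_csv(index=False)
print(csv_text)
```

A trained model would slot in by replacing the rule with `model.predict(...)` on the prepared test features, while the output format stays the same.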