thiscodeWorks - Organizing the best of code online

#python #datasets #split #train #validation

Split CSV into Train and Validation datasets (85%/15%)

import pandas as pd
from sklearn.model_selection import train_test_split

# Load the data from the CSV file
data = pd.read_csv("data.csv")

# Split the data into train (85%) and validation (15%) sets, stratified on the "labels" column so both sets keep similar label proportions
train_data, validation_data = train_test_split(data, train_size=0.85, test_size=0.15, random_state=42, stratify=data["labels"])

# Write the train and validation datasets to CSV files
train_data.to_csv("train.csv", index=False)
validation_data.to_csv("valid.csv", index=False)
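
To illustrate what the stratify argument above is doing, here is a minimal standard-library sketch of a stratified split. It is not scikit-learn's implementation; the function name stratified_split, the sample rows, and the per-label rounding rule are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, train_fraction=0.85, seed=42):
    """Split rows into (train, validation), splitting each label group separately."""
    rng = random.Random(seed)

    # Group the rows by their label so each group can be split on its own.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, validation = [], []
    for label_rows in by_label.values():
        rng.shuffle(label_rows)
        # Assumed rounding rule: nearest whole number of training rows per label.
        cut = round(len(label_rows) * train_fraction)
        train.extend(label_rows[:cut])
        validation.extend(label_rows[cut:])
    return train, validation


# Hypothetical data: 80 rows labelled "a" and 20 rows labelled "b".
rows = [{"text": f"sample {i}", "labels": "a" if i < 80 else "b"} for i in range(100)]
train_rows, validation_rows = stratified_split(rows, "labels")
print(len(train_rows), len(validation_rows))
```

Because each label group is cut at the same fraction, the 80/20 label ratio of the hypothetical input carries over to both output lists, which is the property the stratify argument is meant to provide.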

#powerapps #split #text #trim

Sets a variable to the text that follows the "=" in a link (a YouTube URL in this case)

Set(
    varSplitText,
    Concat(
        LastN(
            Split(TextBoxNew.Text, "=").Result,
            CountRows(Split("https://www.youtube.com/watch?v=", "=").Result) - 1
        ),
        Result
    )
);
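
For readers who want to check the logic outside Power Apps, the formula above keeps only the last piece of the text after splitting on "=". A rough Python equivalent is sketched below, next to a variant that reads the v query parameter directly. The function names and the video ID abc123 are hypothetical and used only for illustration.

```python
from urllib.parse import parse_qs, urlparse


def text_after_last_equals(url):
    """Mirror the formula: split on "=" and keep only the last piece."""
    return url.rsplit("=", 1)[-1]


def video_id(url):
    """Alternative sketch: read the "v" query parameter, or "" if it is missing."""
    query = parse_qs(urlparse(url).query)
    return query.get("v", [""])[0]


print(text_after_last_equals("https://www.youtube.com/watch?v=abc123"))
print(video_id("https://www.youtube.com/watch?v=abc123&t=10s"))
```

The two approaches agree when v is the only parameter. If the link carries extra parameters after the ID, splitting on "=" returns the value of the last parameter instead, so the query-parameter variant may be the safer choice for such links.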

#split #sql #large-sql

SIMPLE WAY TO SPLIT A LARGE SQL FILE

### MAKE DIR
mkdir split && cd split

### SPLIT FILE
split -l 1000 /<path>/output-google.sql /<path>/split/split-

### CHANGE EXTENSION
ls | xargs -I % mv % %.sql
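
Where the split command is not available, the same steps can be sketched in Python. This is an assumption-laden equivalent, not a drop-in replacement: the function name split_by_lines is hypothetical, and the chunks get numeric suffixes (split-0000.sql, split-0001.sql, ...) rather than the aa, ab, ... suffixes that split produces by default.

```python
from itertools import islice
from pathlib import Path


def split_by_lines(source, out_dir, lines_per_file=1000, prefix="split-"):
    """Write source into numbered .sql chunks of at most lines_per_file lines."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    with open(source, "r", encoding="utf-8") as handle:
        index = 0
        while True:
            # Read the next batch of lines without loading the whole file.
            chunk = list(islice(handle, lines_per_file))
            if not chunk:
                break
            target = out_dir / f"{prefix}{index:04d}.sql"
            target.write_text("".join(chunk), encoding="utf-8")
            written.append(target)
            index += 1
    return written
```

Calling split_by_lines with the path of the dump and an output directory returns the list of chunk files it wrote. As with split -l, the cut is made purely by line count, so a statement that spans several lines can end up divided between two chunks.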
