Split train/test data

    The code draws a random sample of the dataframe df to use as the train set. Setting frac=0.8 means the sample contains 80% of the rows, and random_state=0 fixes the seed so the same rows are selected on every run. The test set is the remaining 20%: it is obtained by dropping the rows whose index appears in the train set, which assumes the index of df is unique. Finally, the 'label' column is popped from each set, which removes it from the features and returns it as that set's labels.

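    import pandas as pd

    # df is assumed to be an existing DataFrame with a 'label' column and a unique index
    train_set = df.sample(frac=0.8, random_state=0)  # random 80% of the rows
    test_set = df.drop(train_set.index)  # the remaining 20%
    # pop() removes the column in place and returns it as a Series
    train_labels = train_set.pop('label')
    test_labels = test_set.pop('label')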
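
The snippet above expects df to exist already. As a minimal sketch of the same split on data you can run directly, the example below builds a small toy dataframe; the column name "feature" and the ten rows are assumptions made purely for illustration and are not part of the original snippet.

```python
import pandas as pd

# Hypothetical toy data: 10 rows, one feature column and a 'label' column.
df = pd.DataFrame({
    "feature": range(10),
    "label": [0, 1] * 5,
})

# Random 80% of the rows, reproducible because the seed is fixed.
train_set = df.sample(frac=0.8, random_state=0)
# Everything not sampled above becomes the test set.
test_set = df.drop(train_set.index)

# Separate the labels from the features in each set.
train_labels = train_set.pop("label")
test_labels = test_set.pop("label")

print(len(train_set), len(test_set))  # 8 2
print(list(train_set.columns))        # ['feature']
```

Because the test set is built by dropping index values, the two sets never share a row as long as the index is unique. If df has duplicate index values, calling df.reset_index(drop=True) before sampling is one way to keep the split clean.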