Remove data from Datasets

ℹ️

Note

While it's possible to delete data from a Dataset, entire Datasets can't be deleted using the SDK. Use the Encord platform to delete a Dataset.

Use the dataset.delete_data() method to delete data units from a Dataset.

In the script below, replace <video1_data_hash> and <image_group1_data_hash> with the hashes of the data units you want to remove from the Dataset. If a data unit being removed is stored on Encord-hosted storage, its file is also deleted.

# Import dependencies
from encord import EncordUserClient

# Authenticate with Encord using the path to your private key. Replace <private_key_path> with the path to your private key
user_client = EncordUserClient.create_with_ssh_private_key(
    ssh_private_key_path="<private_key_path>"
)

# Specify the Dataset you want to remove files from. Replace <dataset_hash> with the hash of your Dataset
dataset = user_client.get_dataset(
    "<dataset_hash>"
)

# Specify the files to be deleted. Include all the data hashes below
dataset.delete_data(
    [
        "<video1_data_hash>",
        "<image_group1_data_hash>",
    ]
)
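
Because deletion can also remove files from Encord-hosted storage, it may be worth checking the list of hashes before calling dataset.delete_data(). The following is a minimal sketch of one way to do that, not part of the Encord SDK: split_known_hashes is a hypothetical helper that separates the requested hashes into those present in the Dataset and those that are not. The commented usage assumes that each entry in dataset.data_rows exposes its data hash as .uid; verify this against the SDK version you have installed.

```python
from typing import Iterable, List, Tuple


def split_known_hashes(
    requested: Iterable[str], existing: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split requested hashes into (found, missing) relative to the Dataset.

    Hypothetical helper for illustration; it is not provided by the Encord SDK.
    """
    existing_set = set(existing)
    found: List[str] = []
    missing: List[str] = []
    # dict.fromkeys drops duplicate hashes while preserving their order
    for data_hash in dict.fromkeys(requested):
        if data_hash in existing_set:
            found.append(data_hash)
        else:
            missing.append(data_hash)
    return found, missing


# Hypothetical usage with the `dataset` object from the script above.
# Assumption: each row in `dataset.data_rows` exposes its data hash as `.uid`.
#
# existing = [row.uid for row in dataset.data_rows]
# found, missing = split_known_hashes(
#     ["<video1_data_hash>", "<image_group1_data_hash>"], existing
# )
# if missing:
#     print(f"Not found in Dataset, skipped: {missing}")
# if found:
#     dataset.delete_data(found)
```

Keeping the check separate from the delete call means a mistyped hash is reported rather than silently passed to the SDK.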