this post was submitted on 22 Sep 2024
705 points (98.4% liked)

Technology

59599 readers
3455 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS
 

cross-posted from: https://discuss.tchncs.de/post/22423685

EDIT: For those who are too lazy to click the link, this is what it says

Hello,

Sad news for everyone. YouTube/Google has patched the latest workaround that we had in order to restore the video playback functionality.

Right now we have no other solutions/fixes. You may be able to get Invidious working on residential IP addresses (like at home) but on datacenter IP addresses Invidious won't work anymore.

If you are interested to install Invidious at home, we remind you that we have a guide for that here: https://docs.invidious.io/installation/..

This is not the death of this project. We will still try to find new solutions, but this might take time, months probably.

I have updated the public instance list in order to reflect on the working public instances: https://instances.invidious.io. Please don't abuse them since the number is really low.

Feel free to discuss this politely on Matrix or IRC.

you are viewing a single comment's thread
view the rest of the comments
[–] Gutless2615@ttrpg.network 4 points 2 months ago (1 children)

“Give me a Python script using yt-dlp that I can run on a cronjob that will download the videos from any of my subscribed channels since the last time the script was run”

You can use the following Python script to download videos from your subscribed channels since the last run. This script uses yt-dlp and stores the last download timestamp to track new videos.

First, ensure you have yt-dlp installed:

pip install yt-dlp

Then, create a script called download_videos.py:

import os
import json
import subprocess
from datetime import datetime

# Configuration
last_run_file = 'last_run.json'
download_directory = 'downloads'

# Ensure the download directory exists
os.makedirs(download_directory, exist_ok=True)

# Load the last run time
if os.path.exists(last_run_file):
    with open(last_run_file, 'r') as f:
        last_run = json.load(f)['last_run']
else:
    last_run = datetime.now().isoformat()

# Update the last run time to now
current_run = datetime.now().isoformat()

# Command to get videos from subscribed channels since the last run
command = [
    'yt-dlp',
    '--download-archive', 'archive.txt',
    '--output', f'{download_directory}/%(title)s.%(ext)s',
    '--date-after', last_run,
    '--no-post-overwrites',
    '--merge-output-format', 'mp4',
    'https://www.youtube.com/channel/CHANNEL_ID',  # Replace with your channel URL
]

# Run the command
subprocess.run(command)

# Save the current run time
with open(last_run_file, 'w') as f:
    json.dump({'last_run': current_run}, f)

print("Download complete. Next run will check for videos since:", current_run)

Setting Up the Cron Job

  1. Make the script executable:

    chmod +x download_videos.py
    
  2. Open your crontab:

    crontab -e
    
  3. Add a line to run the script at your desired interval (e.g., daily at 2 AM):

    0 2 * * * /path/to/python /path/to/download_videos.py
    

Notes

  • Replace CHANNEL_ID in the script with your actual channel IDs or use a playlist URL if preferred.
  • The archive.txt file keeps track of already downloaded videos to avoid duplicates.
  • Adjust the paths to Python and your script as needed.
[–] webghost0101@sopuli.xyz 2 points 2 months ago* (last edited 2 months ago) (1 children)

Another example, which i can personally verify has been working fine for months. It works a bit different to the above, it downloads the latests 2* vids that are not already downloaded and runs once every hour with cron. I also attempted to filter out live vids and shorts.

Channels i am "subscribed" too are stored in a single text file, it also uses the avc1 codec because i found p9 and p10 had issues with the jellyfin client on my tv.

looks like this, i added categories but i don't actually use them in the script besides putting them in a variable, lol. Vid-limit is how many of the latests vids it should look at to download. The original reason i implemented that is so i could selectively download a bulk of latests vids if i wanted to.

Cat=Science
Name=Vertitasium
VidLimit=2
URL=https://www.youtube.com/channel/UCHnyfMqiRRG1u-2MsSQLbXA

Cat=Minecraft
Name=EthosLab
VidLimit=2
URL=https://www.youtube.com/channel/UCFKDEp9si4RmHFWJW1vYsMA
#!/bin/bash


# Define the directory to store channel lists and scripts
script_dir="/.../YTDL"

# Define the base directory to store downloaded videos
base_download_dir="/.../youtubevids"

# Change to the script directory
cd "$script_dir"

# Parse the Channels.txt file and process each channel
awk -F'=' '
  /^Cat/ {Cat=$2}
  /^Name/ {Name=$2}
  /^VidLimit/ {VidLimit=$2}
  /^URL/ {URL=$2; print Cat, Name, VidLimit, URL}
' "$script_dir/Channels.txt" | while read -r Cat Name VidLimit URL; do
    # Define the download directory for this channel
    download_dir="$base_download_dir"
    
    # Define the download archive file for this channel
    archive_file="$script_dir/DLarchive$Name.txt"
    
    # Create the download directory if it does not exist
    mkdir -p "$download_dir"
    
    # If VidLimit is "ALL", set playlist_end option to empty, otherwise set it to --playlist-end <VidLimit>
    playlist_end_option=""
    if [[ $VidLimit != "ALL" ]]; then
        playlist_end_option="--playlist-end $VidLimit"
    fi
yt-dlp \
        --download-archive "$archive_file" \
        $playlist_end_option \
        --write-description \
        --write-thumbnail \
        --convert-thumbnails jpg \
        --add-metadata \
        --embed-thumbnail \
        --match-filter "!is_live & !was_live & original_url!*=/shorts/" \
        --merge-output-format mp4 \
        --format "bestvideo[vcodec^=avc1]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
        --output "$download_dir/${Name} - %(title)s.%(ext)s" \
        "$URL"
        
done
[–] Gutless2615@ttrpg.network 1 points 2 months ago (1 children)

Yeah this is more elegant and closer to what I’d actually want to implement. I was more just showing what could be done in literally thirty seconds on the can with ChatGPT.

[–] webghost0101@sopuli.xyz 1 points 2 months ago

I knew i recognized that output.

Mine is actually also made with the help of Chatgpt but manually refined and tested.