this post was submitted on 28 Apr 2025
LocalLLaMA
Qwen3-30B-A3B-Q6_K:
Amt: 8003, Process:10.67s (17.52T/s), Generate:1408.00s (5.68T/s), Total: 1418.67s
Qwen3-30B-A3B-Q6_K - Thoughts (29081 characters)
Okay, I need to write a Python script that sorts all .gguf files in the current directory into subdirectories based on their publisher. The publishers are determined from URLs stored in files named dl{YYYYMMDD}.txt. Each line in these files has a URL, and some might be commented out with a #. The publisher is the first part after the domain in the URL. Let me think through how to approach this.
First, I need to find all the .gguf files in the current directory. Then, for each of them, I need to determine their publisher. But how do I get the URLs associated with each .gguf file? Wait, the problem says that the URLs are in the dl{YYYYMMDD}.txt files. But how do I know which URL corresponds to which .gguf file? Hmm, maybe the URLs in the txt files are the ones that were used to download the .gguf files. So each .gguf file was downloaded via a URL listed in one of these txt files. But how do I associate each .gguf file with its URL?
Wait, the problem statement says that the URLs are in the files named dl{YYYYMMDD}.txt. But the .gguf files themselves don't have any metadata linking them to the URLs. So maybe the assumption is that each .gguf file was downloaded using one of the URLs in these txt files. But how to determine which URL corresponds to which .gguf file? That's a problem. Unless there's some naming convention I'm missing.
Wait, maybe the URLs in the txt files are the ones that were used to download the .gguf files. For example, if a URL is "https://huggingface.co/unsloth/.../file.gguf", then the corresponding .gguf file would be named something like "file.gguf". But the problem says that the .gguf files are in the current directory. So perhaps the script needs to look through all the URLs in the txt files, extract the publisher, and then for each .gguf file, find which URL it came from. But without any direct link, how?
Wait, maybe the .gguf files are named after the file part of the URL. For example, the URL ends with "Qwen3-30B-A3B-Q6_K.gguf?download=true", so the .gguf file would be named "Qwen3-30B-A3B-Q6_K.gguf". If that's the case, then for each .gguf file, we can check if there's a URL in the txt files that has the same filename. But that might not be reliable. Alternatively, maybe the script is supposed to process all the URLs in the txt files, extract their publishers, and then for each .gguf file, determine which publisher it belongs to based on the URL that was used to download it. However, without knowing which URL corresponds to which .gguf file, this seems impossible.
Wait, maybe the problem is that the user expects that each .gguf file was downloaded via one of the URLs in the txt files. So the script should process all the URLs in the txt files, extract the publisher, and then for each .gguf file, find which URL (if any) it corresponds to. But how?
Alternatively, perhaps the script is supposed to take each .gguf file, and for each of them, look through all the URLs in the txt files to find which one matches. But how? For example, if the .gguf file is named "Qwen3-30B-A3B-Q6_K.gguf", then the URL would have that as the filename. So perhaps the script can check for each .gguf file, whether there's a URL in the txt files that has that filename. But that's a bit of a stretch. However, given the problem statement, perhaps that's the intended approach.
Alternatively, maybe the problem is that the URLs in the txt files are the ones that were used to download the .gguf files, and each .gguf file's name is derived from the URL. So the script can process all the URLs, extract the publisher, and then for each .gguf file, determine which URL it came from by matching the filename. But this requires that the .gguf files are named exactly as the file part of the URL.
Assuming that's the case, here's the plan:
But how to handle cases where multiple URLs have the same filename? Or if no URL matches? The problem statement says that the URLs are in the txt files, but maybe not all .gguf files are accounted for. But the script should process as many as possible.
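The filename-matching idea and the two edge cases raised here (several URLs sharing one filename, or no URL matching at all) can be sketched roughly as follows. This is only an illustration under stated assumptions: the URLs are hypothetical placeholders on example.com, every URL is assumed to carry a scheme such as https://, and the helper names parse_line and build_index are invented for this sketch rather than taken from the model's answer.

```python
from collections import defaultdict
from urllib.parse import urlparse


def parse_line(line):
    """Return (publisher, filename) for one line of a dl file, or None."""
    # Commented-out URLs still count, so drop the leading "#" first.
    url = line.strip().lstrip("#").strip()
    if not url:
        return None
    # urlparse keeps "?download=true" out of .path, so the last path
    # node is the bare filename.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or not parts[-1].endswith(".gguf"):
        return None
    return parts[0], parts[-1]  # first node after the domain, file part


def build_index(lines):
    """Map each filename to the set of publishers whose URLs mention it."""
    index = defaultdict(set)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            publisher, filename = parsed
            index[filename].add(publisher)
    return index


# Hypothetical input lines, not taken from the original post.
lines = [
    "https://example.com/unsloth/repo-a/resolve/main/model-a.gguf?download=true",
    "# https://example.com/otherpub/repo-b/resolve/main/model-a.gguf",
    "#https://example.com/otherpub/repo-b/resolve/main/model-b.gguf?download=true",
]
index = build_index(lines)
# Filenames claimed by more than one publisher need a manual decision.
ambiguous = sorted(name for name, pubs in index.items() if len(pubs) > 1)
```

A .gguf file whose name is missing from the index has no known publisher and would simply be left where it is; a name listed in ambiguous cannot be resolved from the URL lists alone.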
Alternatively, perhaps the script is supposed to process all URLs in the txt files, extract the publisher, and then create directories for each publisher, but how does that relate to the .gguf files? Maybe the .gguf files are already in the directory, and the script is supposed to move them into the publisher's directory based on the URLs in the txt files. But without knowing which .gguf file corresponds to which URL, this seems impossible.
Wait, maybe the problem is that the .gguf files were downloaded using the URLs in the txt files, and the script needs to associate each .gguf file with the URL that was used to download it. But how? Unless there's some other metadata. But the problem statement doesn't mention that.
This is a bit confusing. Let me re-read the problem statement.
The user says: "The URLs are in files in the current directory named in the format dl{YYYYMMDD}.txt, with one url per line. Some url lines may be commented out with a single # character, but they are still a valid source of info. An example url is ... and the publisher is always the first node after the domain, in this case unsloth."
So the URLs are stored in th [...trimmed because lemmy won't handle >10,000 chars]
Qwen3-30B-A3B-Q6_K - Answer
Here's a well-documented Python script that fulfills the requirements of sorting .gguf files into directories based on their source publishers, as extracted from URLs stored in dl{YYYYMMDD}.txt files.
What This Script Does
- [...] .gguf files in the current working directory.
- [...] dl{YYYYMMDD}.txt files in the current directory.
- [...] #) but still considered valid.
- [...] .gguf file into the corresponding publisher directory.
Sample Input (from dl20231010.txt)
Python Script
Notes
- [...] dl{YYYYMMDD}.txt.
- [...] .gguf files are named exactly as per the URL's file part (e.g., Qwen3-30B-A3B-Q6_K.gguf).
- [...] .gguf file is referenced by multiple URLs, it will be moved only once (after that, it will no longer exist in the current directory).
- [...] shutil.move, which is safe and works across file systems.
Example Output
This script is designed to be robust and user-friendly, ensuring all .gguf files are properly organized based on their source publishers.
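For readers who want something concrete, here is a minimal sketch of a script along the lines the answer describes. It is not the model's actual output. It assumes each .gguf file is named exactly like the file part of its URL, that URLs carry a scheme such as https://, and that the first URL mentioning a filename decides its publisher. The names publisher_by_filename and sort_gguf are invented for this sketch.

```python
import re
import shutil
from pathlib import Path
from urllib.parse import urlparse

DL_NAME = re.compile(r"^dl\d{8}\.txt$")  # matches dl{YYYYMMDD}.txt


def publisher_by_filename(directory):
    """Map each .gguf filename found in the dl files to its publisher."""
    mapping = {}
    for txt in sorted(directory.iterdir()):
        if not (txt.is_file() and DL_NAME.match(txt.name)):
            continue
        for line in txt.read_text(encoding="utf-8").splitlines():
            # Commented-out lines are still a valid source of info.
            url = line.strip().lstrip("#").strip()
            parts = [p for p in urlparse(url).path.split("/") if p]
            if len(parts) >= 2 and parts[-1].endswith(".gguf"):
                # Assumption: the first URL seen for a filename wins.
                mapping.setdefault(parts[-1], parts[0])
    return mapping


def sort_gguf(directory="."):
    """Move matched .gguf files into per-publisher subdirectories."""
    directory = Path(directory)
    mapping = publisher_by_filename(directory)
    moved = []
    for gguf in sorted(directory.glob("*.gguf")):
        publisher = mapping.get(gguf.name)
        if publisher is None or not gguf.is_file():
            continue  # no matching URL: leave the file in place
        target_dir = directory / publisher
        target_dir.mkdir(exist_ok=True)
        target = target_dir / gguf.name
        if target.exists():
            continue  # never overwrite a file that is already sorted
        shutil.move(str(gguf), str(target))
        moved.append((gguf.name, publisher))
    return moved
```

Calling sort_gguf() with no argument works on the current directory and returns a list of (filename, publisher) pairs for the files it moved. Files without a matching URL stay where they are, which makes a second run harmless.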