this post was submitted on 12 Nov 2024

A Dutch publisher has announced that it will use AI to translate some of its books – but those in the industry are worried about the consequences if this becomes the norm.

and so it begins...

[–] HK65@sopuli.xyz 4 points 3 weeks ago (1 children)

So as a counterpoint to all the comments here, I absolutely see this working. I needed to translate a fairly long work of fiction, and an LLM made my work 10x as fast, since quite obviously my active vocabulary between the two languages differed.

It was much easier and faster to correct the LLM than to write the translation myself. Imagine this replacing workers not in the sense that 1 job becomes 1 LLM subscription, but more like 10 jobs become 2 jobs and an LLM subscription.

[–] DdCno1@beehaw.org 7 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Did you inform your readers that most of the translation was done by the LLM?

[–] HK65@sopuli.xyz 8 points 3 weeks ago (1 children)

That's a very good question.

  • Yes, I have.
  • It was not professional work but a private request from a loved one.
  • It was actually their idea.
  • And I was very, very sceptical about the idea at first, and about the output all throughout the process.

I made extensive edits to the original LLM translation, as it got a lot of things wrong. To be honest, it got wrong a lot of the stuff that is unique to the book and that made the book special, both in wording and in intent, and I had to correct it. My workflow was literally putting the text in the prompt, taking the output, then putting the two texts next to each other and deciding, sentence by sentence, word by word:

  • Is the translation any good? (Around 95% was generally good. Sometimes it trailed off, and I needed to find the point at which it started bullshitting.)
  • Does it use terms that are unique to the book consistently and in the right way? (It almost never did. I literally had a dictionary of the most frequent mistakes.)
  • Could I have done it better? Do I know a way to better convey the intent? (This happened quite rarely, as it did a near word-for-word translation. The biggest problems were idioms that made sense in one language but not in the other, and misgendered characters.)
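
The term-consistency check in the second bullet could be sketched roughly as below. This is only an illustration, assuming the "dictionary of the most frequent mistakes" is a plain mapping from a wrong rendering to the preferred term. The glossary entries, the function name `find_term_mistakes`, and the sample sentences are all hypothetical and not taken from the actual book or workflow.

```python
import re

# Hypothetical glossary: frequent wrong rendering -> preferred term.
GLOSSARY = {
    "fire lizard": "emberdrake",
    "the old forest": "the Eldwood",
}

def find_term_mistakes(sentences, glossary):
    """Return (sentence_index, wrong_term, preferred_term) for each hit."""
    hits = []
    for i, sentence in enumerate(sentences):
        for wrong, preferred in glossary.items():
            # Whole-word, case-insensitive match so "Fire lizard" is caught too.
            pattern = r"\b" + re.escape(wrong) + r"\b"
            if re.search(pattern, sentence, re.IGNORECASE):
                hits.append((i, wrong, preferred))
    return hits

# Made-up draft sentences standing in for LLM output.
draft = [
    "The fire lizard slept by the hearth.",
    "They walked into the Eldwood at dawn.",
    "Nobody returned from The Old Forest.",
]

for index, wrong, preferred in find_term_mistakes(draft, GLOSSARY):
    print(f"sentence {index}: replace '{wrong}' with '{preferred}'")
```

A check like this only flags candidates. Deciding whether each hit is really a mistake still needs the sentence-by-sentence reading described above.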

All in all, I think the LLM did the heavy lifting in remembering all the odd words and grammar, and it gave me a very flawed first draft. It was 80% of the time, but like 5% of the actual creative work that goes into a translation.

I spent 90% of my time outside the LLM, in my text editor.

[–] DdCno1@beehaw.org 2 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

That's a very good answer.

If I'm getting this right, this was a novel that you perhaps mentioned to your loved one, but a language barrier prevented them from reading it. They then suggested the use of an LLM to translate it, which you used as a foundation to build upon. If I may ask, which story did you translate (it has to be good if you spent this much work on it) and which LLM did you use?

I can't see anything wrong with this. I've used this kind of approach with all sorts of machine translation tools going back over 20 years (not for entire books though). Let the computer do its thing, then fix mistakes - but this was always noncommercial, private use for myself, friends and relatives, as well as the occasional friendly online community. That said, I've also done entirely manual work, with no machine translation at all, in situations where I wanted the best possible quality or where complexity and nuance made anything else impossible - like with a long list of "whisper jokes" from Nazi Germany, subversive jokes that people told each other under threat of death and that require a ton of context no translation tool could possibly have.

The point here, though, is that this is very different from a publisher doing this commercially - and you and I both know that these companies will not even allow for the bare minimum of time spent fixing mistakes made by the translation tools.

[–] HK65@sopuli.xyz 3 points 3 weeks ago

No, the loved one was actually the author. It's a children's book, light fiction; think early Harry Potter, for example.

It's a self-published hobby project, with a few dozen copies sold in the original language, since there are relatively few speakers and light novels for kids are unfortunately a very small niche everywhere. We didn't really market it either, since earning money wasn't really the goal. The reason I'm mentioning that it was not professional work is that I'm not misrepresenting the amount of work done to someone paying me, and I'm actually interested in preserving the qualities of the original. I really don't want to make more LLM slop, and I especially don't want to make LLM slop out of something that has meaning to me personally. I've put at least a few hundred hours of manual work into it to make sure it isn't.

But the idea is indeed to self-publish it and sell a few copies to people who are interested. It's not about the income (the author actually has a regular job and is freelancing in 2 others, this is literally just a hobby), it's more about the feeling of having made something that made other people interested enough to pay five bucks for it.

Responding to the other topic, one interesting thing about translation that I've found out (and mistranslations from the LLM actually helped spark this idea) is that if you can somehow convey the context to the reader, it can make the text fresh and interesting and something they haven't read before. That's true not just of idioms, but of other cultural patterns as well.

Think how the world and themes of The Witcher were something refreshing and new for most international audiences, while in its home country it was very recognizable where the author got his material from.