Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request
From: Random Coder
Subject: Re: [Bug-wget] Incorrect handling of Cyrillic characters in http request - any workaround?
Date: Tue, 31 Mar 2015 15:05:19 -0700

On Tue, Mar 31, 2015 at 10:11 AM, Stephen Wells <address@hidden> wrote:
> Dear all - I am currently trying to use wget to obtain mp3 files from the
> Google Translate TTS system. In principle this can be done using:
>
> wget -U Mozilla -O "${string}.mp3" \
>   "http://translate.google.com/translate_tts?tl=TL&q=${string}"
>
> ...
>
> http://translate.google.com/translate_tts?tl=ru&q=%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C
>
> This of course produces a string of gibberish in the resulting mp3 file!

That URL is correct; it's what a browser sends across the wire for
the same string. Google is returning gibberish because of some
User-Agent sniffing that it appears to be doing.
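
To check the encoding claim, the percent-encoded query can be
reproduced from the raw UTF-8 bytes of the word. This is a minimal
sketch: urlencode is a hypothetical helper, not part of wget, and it
escapes every byte (including plain ASCII), which is valid but more
verbose than what a browser emits.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode every byte of the argument.
urlencode() {
    printf '%s' "$1" \
        | od -An -v -tx1 \
        | tr -d ' \n' \
        | sed 's/\(..\)/%\1/g' \
        | tr 'a-f' 'A-F'
}

# UTF-8 bytes of the Russian word from the quoted URL, written as
# octal escapes so the script does not depend on the file's encoding.
word=$(printf '\320\274\320\260\320\267\320\260\321\202\321\214')

urlencode "$word"; echo
# prints %D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C
```

The output matches the query string in the URL quoted above, so the
request itself is not the problem.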

If you change the user agent to something more complete, such as
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2228.0 Safari/537.36", instead of just "Mozilla", it
should work correctly.
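
Put together, the suggestion would look roughly like the sketch below.
The variable names and the fetch_tts wrapper are made up for
illustration; only the -U and -O options, the URL and the user agent
string come from this thread. Whether the service still responds this
way is not something the sketch checks.

```shell
#!/bin/sh
# Illustrative values taken from the URL quoted earlier in the thread.
lang="ru"
query="%D0%BC%D0%B0%D0%B7%D0%B0%D1%82%D1%8C"

# Full browser-style user agent instead of the bare "Mozilla".
ua="Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"

url="http://translate.google.com/translate_tts?tl=${lang}&q=${query}"

# Hypothetical wrapper: download the audio to the file named in $1.
# -U sets the User-Agent header, -O sets the output file.
fetch_tts() {
    wget -U "$ua" -O "$1" "$url"
}

# Uncomment to actually perform the request:
# fetch_tts "out.mp3"
```

Quoting "$ua" matters here because the string contains spaces and
parentheses that the shell would otherwise split or interpret.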