Half-Life and Adrenaline Gamer forum

All about playing Half-Life and AG
Post subject: UTF-8 in Chat check letters
Author: abdobiskra
Posted: 11 May 2023, 16:12
Hi, first of all, welcome back to the forum :D
strlen() does not support multi-byte characters (UTF-8). Is there a similar function that does?
Could you give an example of how strfind() can determine the position of a letter in a word?
For example, I want to replace every letter of a word except the last one. How do I do that?

_________________
https://vk.com/kgbaghl


Post subject: Re: UTF-8 in Chat check letters
Author: Lev (Site Admin)
Posted: 15 May 2023, 12:27
Hi!
https://www.amxmodx.org/api/string/__functions
It says nothing about strlen not supporting UTF-8.
In fact, strlen works for UTF-8 strings: it returns the length in bytes (not characters).
To find a letter, just use strfind with the same multi-byte string you are looking for. It searches for a byte-for-byte match.
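
A minimal sketch of the difference between bytes and characters, assuming the plugin source is saved as UTF-8 without a BOM so that string literals keep their multi-byte form. utf8_char_count is a hypothetical helper, not part of AMX Mod X; it counts every byte that is not a UTF-8 continuation byte (10xxxxxx).

```pawn
#include <amxmodx>

// Hypothetical helper: counts UTF-8 characters instead of bytes.
// Every byte that is NOT a continuation byte (10xxxxxx) starts a new character.
stock utf8_char_count(const text[])
{
   new count = 0

   for (new i = 0; text[i] != 0; i++)
   {
      // Only bits 6 and 7 matter, so this also works if the byte was stored sign-extended.
      if ((text[i] & 0xC0) != 0x80)
         count++
   }

   return count
}

public plugin_init()
{
   register_plugin("UTF-8 length demo", "1.0", "example")

   // Four Arabic letters, two bytes each in UTF-8.
   new const word[] = "سلام"

   server_print("bytes: %d", strlen(word))
   server_print("chars: %d", utf8_char_count(word))

   // strfind returns a BYTE offset, or -1 if the substring is not found.
   server_print("byte offset of lam: %d", strfind(word, "ل"))
}
```

For this word strlen should report 8 and the helper 4, and strfind should report offset 2 for ل, because س occupies bytes 0 and 1.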


Post subject: Re: UTF-8 in Chat check letters
Author: abdobiskra
Posted: 16 May 2023, 02:04
- How can I replace all the letters in each word in the chat except the last letter of each word, using the available functions?
- I identified the last letter with this function, but it only works with Latin (English) letters, and in the end it didn't give the result I'm looking for.

This is my code and what I have tried:
Code:
new Separate_letters_Symbol[][] = {
   "ا", "ب", "ت", "ث", "ج", "ح", "خ", "س", "ش", "ص", "ض", "ع",
   "غ", "ف", "ق", "ك", "م", "ن", "ي", "ة", "ى", "ه", "ل",
   "ئ"
}

new Connected_letters_Symbol[][] = {
   "ﺎ", "ﺑ", "ﺗ", "ﺛ", "ﺟ", "ﺣ", "ﺧ", "ﺳ", "ﺷ", "ﺻ", "ﺿ", "ﻋ",
   "ﻏ", "ﻓ", "ﻗ", "ﻛ", "ﻣ", "ﻧ", "ﻳ", "ﺔ", "ﻰ", "ﻪ", "ﻟ",
   "ﺋ"
}

public plugin_init() {
   register_plugin( PLUGIN, VERSION, AUTHOR )
   
   register_clcmd( "say", "CheckMessage" )
   register_clcmd( "say_team", "CheckMessage" )
}


public CheckMessage(id) {
   static said[192], said_to_utf16[192], said_to_utf8[192], name[33];
   
   read_args( said, charsmax(said) )
   remove_quotes( said )
   trim( said )
   
   MultiByteToWideChar(said, said_to_utf16)

   if(isArabic(said_to_utf16)) {
   
      ReverseString(said_to_utf16)
   }
   
   WideCharToMultiByte(said_to_utf16, said_to_utf8)
   
   for( new i; i < sizeof Separate_letters_Symbol; i++ ) {
      
      new len = strlen(said_to_utf8)
      
      for (new i = 0; i < len; i++) {
         
         if (i != len - 1) {

            replace_all(said_to_utf8, charsmax(said_to_utf8), Separate_letters_Symbol[i], Connected_letters_Symbol[i]);
         }
      }
   }
   get_user_name(id, name, charsmax(name));
   
   client_print(0, print_chat, " (AR) %s : %s",name, said_to_utf8)

   return PLUGIN_HANDLED
}

Post subject: Re: UTF-8 in Chat check letters
Author: Lev (Site Admin)
Posted: 24 May 2023, 19:08
Why do you use the variable name i for both the outer and the inner for loop? It is a mistake.
I don't know what you are trying to do, nor do I understand why you use MultiByteToWideChar, WideCharToMultiByte and ReverseString.
Probably this is about right-to-left writing, but I don't know how people chat in these languages.
Without a correct description I can't help.


Post subject: Re: UTF-8 in Chat check letters
Author: abdobiskra
Posted: 26 May 2023, 20:33
Arabic is written from right to left (both words and letters), but in the game it is not displayed correctly.
For example, "(سلام عليكم) = (slam alaykum)" appears in the game as "(م ك ي ل ع _ م ا ل س) = ( m l a s _ m u k y a l a )", i.e. as separate letters running from left to right, which is incomprehensible.
Arabic letters take a separate form when they are written alone or stand at the end of a word. The letters of the alphabet are an example: "(أ ب ت ث ...etc)= (a b c d ... etc)". At the end of a word, for example in "(سلام) = (slam)", the letter "(م) = (M)" is separate, while at the beginning or in the middle of a word it is connected, like this: (مــ)
What I'm trying to do in the plugin is replace the separate characters with the connected (cursive) ones and flip the text from right to left so that it is readable. To do that I convert the string to UTF-16, reverse it, and convert it back to UTF-8, as I was advised. The results have been satisfactory so far.
What remains is checking and fixing other cases, including the last letter of a word, which should keep its separate form. My idea was to replace the letters of the word but skip the last one. I tried, but I can't find a suitable way to do it.
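
One possible way to skip the last letter of every word is to walk the UTF-8 bytes directly instead of calling replace_all on the whole line. The sketch below is only an illustration under several assumptions: the text is still in its original (logical) order, i.e. it runs before ReverseString; every letter in Separate_letters_Symbol is two bytes long; and words are delimited by spaces only. find_letter and connect_except_last are hypothetical helpers, and the tables are trimmed to three entries for brevity. It is not a full Arabic shaping algorithm.

```pawn
#include <amxmodx>

// Trimmed copies of the tables from the plugin (seen, lam, meem only).
new const Separate_letters_Symbol[][] = { "س", "ل", "م" }
new const Connected_letters_Symbol[][] = { "ﺳ", "ﻟ", "ﻣ" }

// Hypothetical helper: table index of the 2-byte letter starting at text[pos], or -1.
stock find_letter(const text[], pos)
{
   for (new i = 0; i < sizeof Separate_letters_Symbol; i++)
   {
      // Compare byte values only, in case high bytes are stored sign-extended.
      new first = Separate_letters_Symbol[i][0] & 0xFF
      new second = Separate_letters_Symbol[i][1] & 0xFF

      if ((text[pos] & 0xFF) == first && (text[pos + 1] & 0xFF) == second)
         return i
   }

   return -1
}

// Hypothetical helper: copies input to output, writing the connected form of
// every table letter except the last letter of each space-delimited word.
stock connect_except_last(const input[], output[], maxlen)
{
   new src = 0, dst = 0

   // Stop while there is still room for one 3-byte connected form.
   while (input[src] != 0 && dst + 3 <= maxlen)
   {
      new idx = find_letter(input, src)

      if (idx == -1)
      {
         // Not in the table (space, Latin text, other letters): copy one byte as is.
         output[dst++] = input[src++]
         continue
      }

      new next = input[src + 2]

      if (next == 0 || next == ' ')
      {
         // Last letter of the word: keep the separate form.
         output[dst++] = input[src]
         output[dst++] = input[src + 1]
      }
      else
      {
         // Any other position: write the connected form.
         dst += copy(output[dst], maxlen - dst, Connected_letters_Symbol[idx])
      }

      src += 2
   }

   output[dst] = 0
   return dst
}

public plugin_init()
{
   register_plugin("Connect except last", "1.0", "example")

   new shaped[192]
   connect_except_last("سلام", shaped, charsmax(shaped))
   server_print("%s", shaped)
}
```

In the chat handler this would presumably be called on said before the UTF-16 conversion and reversal, replacing the replace_all loop.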

Posted after 15 minutes 47 seconds:
Lev wrote:
Why do you use the variable name i for both the outer and the inner for loop? It is a mistake.
Yes, sorry, I sent the wrong code. I used "j" inside the loop, but I have now dropped all of that. Instead I extract the word without its last letter, extract the last letter separately, and then join them. It is a silly method: the result has a space between the two parts, and I could not generalize it to all the words in a sentence (every word in a sentence must have its last letter separated). It's complex :pardon: This is the whole code I'm using:
 

Results:


Attachment: Capture.PNG

Post subject: Re: UTF-8 in Chat check letters
Author: Lev (Site Admin)
Posted: 28 May 2023, 15:26
The first thing you should try to achieve is to output text to the client screen so that it looks as required. For that I advise you to move to the byte level. If you see that at least part of the sentence looks correct, capture it and analyze the byte order. Check the last-letter case: you will probably be able to add some bytes (text, spaces, dots) so that the last letter does not appear separated.
Once you have an output byte sequence that looks good, you can start working on the input byte sequence to convert it into the correct form.
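
A small sketch of how the byte sequence of a string could be printed for such a comparison. dump_bytes is a hypothetical helper, not an AMX Mod X function; it writes each byte as two hex digits to the server console. Long strings may be cut off by the console line limit.

```pawn
#include <amxmodx>

// Hypothetical helper: prints every byte of text as two hex digits.
stock dump_bytes(const label[], const text[])
{
   new const hex[] = "0123456789ABCDEF"
   new line[256], pos = 0

   // Each input byte takes three output cells: two hex digits and a space.
   for (new i = 0; text[i] != 0 && pos + 3 < sizeof line; i++)
   {
      new b = text[i] & 0xFF   // mask in case the byte was stored sign-extended

      line[pos++] = hex[b >> 4]
      line[pos++] = hex[b & 0x0F]
      line[pos++] = ' '
   }

   line[pos] = 0

   server_print("%s: %s", label, line)
}

public plugin_init()
{
   register_plugin("Byte dump demo", "1.0", "example")

   // Assumes this source file is saved as UTF-8 without a BOM.
   dump_bytes("sample", "سلام")
}
```

Calling it once on the string that displays well and once on the string produced from the chat input, for example dump_bytes("plugin", said_to_utf8), gives two lines that can be compared byte by byte.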


Post subject: Re: UTF-8 in Chat check letters
Author: abdobiskra
Posted: 08 June 2023, 00:20
Lev wrote:
The first thing you should try to achieve is to output text to the client screen so that it looks as required.
Code:
      MultiByteToWideChar(said, said_to_utf16)

   if(isArabic(said_to_utf16)) {
   
      ReverseString(said_to_utf16)
   
      WideCharToMultiByte(said_to_utf16, said_to_utf8)
     
      for( new i; i < sizeof Separate_letters_Symbol ; i++ ) {
         
         replace_all(said_to_utf8, charsmax(said_to_utf8), Separate_letters_Symbol[i], Connected_letters_Symbol[i]);
      }
In this part of the code above I captured the chat line as it should look (said_to_utf8):
the letters within each word and the words themselves were reordered from right to left, and then the separate letters were replaced with connected ones.
Lev wrote:
For that I advise you to move to the byte level. If you see that at least part of the sentence looks correct, capture it and analyze the byte order. Check the last-letter case: you will probably be able to add some bytes (text, spaces, dots) so that the last letter does not appear separated.
Code:
      new last_leter[192]
     
      getLastChar_UTF8(last_leter, charsmax(last_leter), said_to_utf8)
     
      for( new i; i < sizeof Connected_letters_Symbol ; i++ ) {
         
         replace_all(last_leter, charsmax(last_leter), Connected_letters_Symbol[i], Separate_letters_Symbol[i]);
      }
Yes, I think I did that in this part of the code: I captured the last letter of the word from the output of the previous code, then converted it back from a connected letter to a separate letter (the opposite of what the previous code does).

Code:
      new said_utf8[192];//First Letters
     
      getWord_UTF8(said_utf8, charsmax(said_utf8), said_to_utf8)
     
      new result[192];
      formatex(result, charsmax(result), "%s%s", last_leter, said_utf8);
Here I captured the word without its last letter and then joined the two parts into one result (i.e. the last letter, converted back earlier, followed by the word without its last letter).
The results are somewhat unsatisfactory because there is a space between the last letter and the rest of the word.
Result I get: م(space)سلا (as in the attached picture)
Result I need: سلام (like the default output above, but the last letter should have the form (م), not (مـ))
I am wondering whether there is another way to analyze the letters inside each word and skip the replacement of the last letter, i.e. replace the letters of each word but leave its last letter as it is.
Lev wrote:
Once you have an output byte sequence that looks good, you can start working on the input byte sequence to convert it into the correct form.
Could you give me a simple example to make it clearer?

Post subject: Re: UTF-8 in Chat check letters
Author: Lev (Site Admin)
Posted: 08 June 2023, 15:27
abdobiskra wrote:
Could you give me a simple example to make it clearer?
You can start with just a single-line plugin:
Code:
client_print( 0, print_chat, "Your text in right-to-left-form" )
Then check how it looks on the client. If you manage to output the text correctly, post that string from the plugin together with the chat-captured string that you wish to convert to that output. Then I could probably help you to mangle it.


Post subject: Re: UTF-8 in Chat check letters
Author: abdobiskra
Posted: 09 June 2023, 03:00
I think I did that above?

(Valve): how letters and words appear, from left to right, in the normal chat
(Plugin): the chat after modification by the functions available in the plugin
The plugin output still needs the last letter handled: either skip its replacement or restore its shape afterwards, or use any other method that works.
Quote:
(Valve) Abdo : م ك ل ا ح () ف ي ك () م ك ي ل ع () م ال س
(Plugin) Abdo : سلامـ عليكمـ كيفـ حالكمـ
I want it like that : سلام عليكم كيف حالكم
Notice the last letter of each word: that is how I want them.


Attachment: Capture.PNG

Post subject: Re: UTF-8 in Chat check letters
Author: Lev (Site Admin)
Posted: 13 June 2023, 22:36
A byte-by-byte comparison of the two texts reveals extra bytes.
Attachment: RTL.jpg

Top text is "سلامـ عليكمـ كيفـ حالكمـ", bottom: "سلام عليكم كيف حالكم"
Try to remove these bytes from the string.
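
A hedged sketch of such a removal. It assumes the extra bytes are the tatweel character (U+0640, UTF-8 bytes D9 80) that is visible after the last letter of each word in the top text; only the screenshot shows the exact bytes, so this is an assumption. strip_tatweel is a hypothetical helper that edits the string in place.

```pawn
#include <amxmodx>

// Hypothetical helper: removes every tatweel (U+0640, bytes D9 80) from text in place.
// Returns the new length in bytes.
stock strip_tatweel(text[])
{
   new src = 0, dst = 0

   while (text[src] != 0)
   {
      // Compare byte values only, in case high bytes are stored sign-extended.
      if ((text[src] & 0xFF) == 0xD9 && (text[src + 1] & 0xFF) == 0x80)
      {
         src += 2   // skip both bytes of the tatweel
         continue
      }

      text[dst++] = text[src++]
   }

   text[dst] = 0
   return dst
}

public plugin_init()
{
   register_plugin("Strip tatweel demo", "1.0", "example")

   // Assumes this source file is saved as UTF-8 without a BOM.
   new text[64] = "سلامـ"
   strip_tatweel(text)
   server_print("%s", text)
}
```

If the plugin's real output ends words with an initial-form glyph such as ﻣ rather than a letter followed by a tatweel, this removal would not change it, and the last letter would have to be kept in its separate form when the table replacement is applied.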

