Reading characters from socket weirdness
Anyone care to explain this one to me?
Send a character value of 1, receive a 1. So 1 -> 1, 2 -> 2, and so on, until I tried 150; then it receives 194 150 as two separate characters:
150 -> 194 150
175 -> 194 175
180 -> 194 180
I tried the BufferedReader without the cRIO (on Windows and Linux): send a 180, receive a 180.
Re: Reading characters from socket weirdness
Characters are converted to and from numeric codes via mapping tables (Charsets in Java), and systems may differ (or be configured differently) in which one they use by default. Two common ones are ISO-8859-1 (aka Latin-1) and UTF-8. These use the same mappings for numeric codes 0-127 (covering most common English text and Java source), but differ in their encoding of other characters. For example, the character (an accent symbol) that's encoded as a single byte with value 0xB4 (180) in Latin-1 is encoded as the two-byte sequence 0xC2 0xB4 (194 180) in UTF-8.
If your intent is just to transfer raw data across the socket, stay away from anything involving characters and character sets, and just send byte sequences.
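The difference between the two encodings can be checked on a desktop JVM. The following is only a sketch: the class name `EncodingDemo` and the helper `encode` are made up for illustration, and it assumes Java 7 or later (for `StandardCharsets`), which may not match what is available on the robot.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: shows the bytes each charset produces for one char value.
class EncodingDemo {
    // Encode a single char value and return the resulting bytes as unsigned ints.
    static int[] encode(int value, Charset charset) {
        byte[] raw = String.valueOf((char) value).getBytes(charset);
        int[] unsigned = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            unsigned[i] = raw[i] & 0xFF; // mask away the sign extension
        }
        return unsigned;
    }

    public static void main(String[] args) {
        for (int value : new int[] {100, 150, 180}) {
            System.out.println(value
                    + "  Latin-1: " + Arrays.toString(encode(value, StandardCharsets.ISO_8859_1))
                    + "  UTF-8: " + Arrays.toString(encode(value, StandardCharsets.UTF_8)));
        }
    }
}
```

Values up to 127 come out as one identical byte in both encodings; 150 and 180 stay single bytes in Latin-1 but become the two-byte sequences 194 150 and 194 180 in UTF-8, which matches the pattern reported in the first post.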
Re: Reading characters from socket weirdness
I understand that. I was sending a byte value wrapped in character form to avoid signing. I tried sending the data as bytes and reading as bytes and still got the same result.
Re: Reading characters from socket weirdness
Want to post some code?
Re: Reading characters from socket weirdness
Sure.
On the robot. Code:
*snip*
On the computer. Code:
*snip*
The robot reports it is receiving 1 2 3 100 194 150 194 180. Even larger numbers have other strangeness with them: I sent a 200 and the robot received 131 199.
Re: Reading characters from socket weirdness
In Java a char is a 16-bit value assumed to mean a Unicode character or "code point". This is an internal representation, and any time you do I/O through encoding-aware classes (InputStreamReader/OutputStreamWriter) it gets converted to or from an external representation. The external encoding can be either explicitly specified or taken from the platform's default. If the reader doesn't use the same encoding as the writer, mismatches occur and the reader doesn't get out the same internal "char" values the writer put in. Both UTF-8 and forms of Latin-1/ISO-8859-1 are in common use as defaults, so relying on defaults is dangerous when passing data between dissimilar machines. What's insidious is that these two encodings, though strictly speaking incompatible, actually do map 0-127 the same way, so programs only passing code points in this range appear to work, even if they're mismatched.
Below is some code you can play with to observe the various interactions, but the takeaways are 1) Don't use encoding-aware APIs unless what you're passing really is text data, and 2) If you are passing encoded text between different platforms, specify the encoding explicitly. Code:
/////////////////////////
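The posted code did not survive beyond its first line, so the following is not the original. It is a minimal stand-in sketch of the same idea, assuming a desktop JVM with Java 8 or later; `CharsetRoundTrip` and `roundTrip` are hypothetical names, and an in-memory byte array stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: write char values with one charset, read them back with another.
class CharsetRoundTrip {
    static List<Integer> roundTrip(int[] values, Charset writerCharset, Charset readerCharset) {
        try {
            // The byte array plays the role of the socket's wire.
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (Writer out = new OutputStreamWriter(wire, writerCharset)) {
                for (int value : values) {
                    out.write(value); // writes the low 16 bits as one char
                }
            }
            List<Integer> received = new ArrayList<>();
            try (Reader in = new InputStreamReader(
                    new ByteArrayInputStream(wire.toByteArray()), readerCharset)) {
                int c;
                while ((c = in.read()) != -1) {
                    received.add(c);
                }
            }
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sent = {1, 100, 150, 180};
        // Matched encodings: the reader gets back what the writer put in.
        System.out.println(roundTrip(sent, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // Mismatched encodings: values above 127 arrive as two chars each.
        System.out.println(roundTrip(sent, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

With a UTF-8 writer and a Latin-1 reader, the values 1 and 100 pass through unchanged while 150 and 180 each arrive as a pair starting with 194, the same symptom the robot reported. Passing the charset explicitly on both sides, as in the matched call, removes the dependence on platform defaults.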
Dealing with UNsigned values in Java
I agree that reading and writing bytes is what you want here. To get around the "sign extension" problem requires an extra operation...
To recap, most integral types in Java are signed:
byte - signed 8-bit (-128..127)
short - signed 16-bit (-32768..32767)
char - unsigned 16-bit (0..65535) (mostly for the Unicode character set, but you can use it as a numeric type)
int - signed 32-bit (-2147483648..2147483647)
long - signed 64-bit (0x8000000000000000L..0x7fffffffffffffffL) (the decimal values are getting less useful here!)
Compared to C, Java is nice in that the limits for each type are the same on every platform. BUT - it's much more of a pain to deal with unsigned values in Java. Also note that Java's "char" is nothing like a C "char".
To deal with unsigned values in Java you need to promote the type to a "larger" type, then mask the value back to an 8 (or 16, or 32) bit range.
BYTE: int unsignedVal = signedValue & 0xFF;
SHORT: int unsignedVal = signedValue & 0xFFFF;
INT: long unsignedVal = signedValue & 0x0FFFFFFFFL;
To extend buchanan's RawReader example: Code:
InputStream in = s.getInputStream();
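The rest of that example is cut off, so here is a separate, self-contained sketch of the masking idea rather than a reconstruction of it. The class and method names are invented for illustration, and a `ByteArrayInputStream` stands in for the socket's input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helpers: promote to a wider type, then mask back to the original width.
class UnsignedBytes {
    static int unsignedByte(byte b) {
        return b & 0xFF;
    }

    static int unsignedShort(short s) {
        return s & 0xFFFF;
    }

    static long unsignedInt(int i) {
        return i & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for s.getInputStream(): three raw bytes as they would arrive.
        InputStream in = new ByteArrayInputStream(new byte[] {(byte) 150, (byte) 180, (byte) 200});
        byte[] buffer = new byte[3];
        int count = in.read(buffer);
        for (int i = 0; i < count; i++) {
            // buffer[i] is signed (negative for values above 127); the mask recovers 0..255.
            System.out.println(buffer[i] + " -> " + unsignedByte(buffer[i]));
        }
    }
}
```

Reading into a `byte[]` gives signed values, so 150 shows up as -106 until it is masked. No charset is involved at any point, which is why this route avoids the two-byte surprise.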
Re: Reading characters from socket weirdness
I had at one point sent raw byte data as well. But the fact remains that when you use read() or readLine(), the API returns 2 bytes of data for the larger values, whereas the documentation says it returns an int between 0 and 255.
I will try again when I have access to the robot. So essentially I need to select the "other" encoding. To help reduce ping-pong debugging: does anyone know what encoding the cRIO natively reads, so I can set it in the OutputStreamWriter?
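One thing that may help while waiting for robot access: in the standard Java library, `InputStream.read()` is documented to return 0 to 255, while `Reader.read()` returns a decoded char in the range 0 to 65535. The sketch below (hypothetical class `ReadComparison`, desktop JVM, Java 8 or later assumed) reads the same two bytes both ways. It does not say anything about which encoding the cRIO actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same wire bytes seen through a byte stream and through a Reader.
class ReadComparison {
    // The two bytes a UTF-8 writer emits for char value 150.
    static final byte[] WIRE = {(byte) 0xC2, (byte) 0x96};

    // Byte-level read: one int in 0..255 per byte on the wire.
    static int[] readAsBytes() {
        ByteArrayInputStream in = new ByteArrayInputStream(WIRE);
        return new int[] {in.read(), in.read()};
    }

    // Character-level read: the Reader decodes both bytes into a single char.
    static int readAsUtf8Char() {
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(WIRE), StandardCharsets.UTF_8)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] bytes = readAsBytes();
        System.out.println("as bytes: " + bytes[0] + " " + bytes[1]);
        System.out.println("as UTF-8 char: " + readAsUtf8Char());
    }
}
```

The byte-level read yields 194 and 150, and the UTF-8 reader yields a single 150. So seeing 194 150 on the receiving side is consistent with the sender writing through a UTF-8 encoder even when the receiver reads plain bytes.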
Copyright © Chief Delphi