#1
13-03-2011, 12:23
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
Reading characters from socket weirdness

Anyone care to explain this one to me?

Send a character value of 1, receive a 1:
  1 -> 1
  2 -> 2
and so on, until I tried 150. Then it receives 194 150 as two separate characters:
  150 -> 194 150
  175 -> 194 175
  180 -> 194 180

I tried the BufferedReader without the cRIO (on Windows and Linux): send a 180, receive a 180.
#2
13-03-2011, 15:59
buchanan buchanan is offline
Registered User
FRC #2077 (Laser Robotics)
Team Role: Mentor
 
Join Date: Mar 2009
Rookie Year: 2007
Location: Wales, WI
Posts: 70

Characters are converted to and from numeric codes via mapping tables (Charsets in Java), and systems may differ (or be configured differently) in which one they use by default. Two common ones are ISO-8859-1 (aka Latin-1) and UTF-8. These use the same mappings for numeric codes 0-127 (covering most common English text and Java source), but differ in their encoding of other characters. For example, the acute accent character ´ (U+00B4) is encoded as the single byte 0xB4 (180) in Latin-1, but as the two-byte sequence 0xC2 0xB4 (194 180) in UTF-8.

If your intent is just to transfer raw data across the socket, stay away from anything involving characters and character sets, and just send byte sequences.
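
To make the Latin-1 vs UTF-8 difference concrete, here is a minimal sketch that encodes the same char both ways and prints the resulting byte values. It is not from the thread's code: the class name EncodingDemo and the helper encode are made up for illustration, and it assumes a desktop JVM (Java 7 or later, for StandardCharsets), not the cRIO runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encodes one char in the given charset and returns the bytes as
    // space-separated unsigned decimal values.
    static String encode(char c, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(cs)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(b & 0xFF); // mask so 0xB4 prints as 180 rather than -76
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char accent = (char) 180; // U+00B4, acute accent
        System.out.println("ISO-8859-1: " + encode(accent, StandardCharsets.ISO_8859_1)); // 180
        System.out.println("UTF-8:      " + encode(accent, StandardCharsets.UTF_8));      // 194 180
        // Codes 0-127 come out identical in both charsets
        System.out.println("UTF-8, 100: " + encode((char) 100, StandardCharsets.UTF_8));  // 100
    }
}
```

The 194 prefix is the UTF-8 lead byte 0xC2, which is why it shows up in front of every value from 128 to 191.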
#3
13-03-2011, 23:21
drakesword drakesword is offline

I understand that. I was sending a byte value wrapped in character form to avoid signing. I tried sending the data as bytes and reading as bytes and still got the same result.
#4
14-03-2011, 09:55
buchanan buchanan is offline

Want to post some code?
#5
14-03-2011, 10:08
drakesword drakesword is offline

Sure

On the robot.
Code:
*snip*
client = scn.acceptAndOpen();
reader = new BufferedReader(new InputStreamReader(client.openInputStream()));
*snip*
char[] data = reader.readLine().toCharArray();

for(lcv = 0; lcv < data.length; lcv++)
{
    System.out.print((int)data[lcv] + " ");
}
System.out.println();

On the computer

Code:
*snip*
out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream())); // socket: the connected Socket (name assumed; its declaration is in the snipped code)
*snip*
out.write(new char[]{1,2,3,100,150,180});
out.write("\n");
out.flush();
So I am sending 1 2 3 100 150 180
robot reports it is receiving 1 2 3 100 194 150 194 180

Even larger numbers have other strangeness with them. Sent a 200 and the robot received a 131 199
#6
14-03-2011, 13:03
buchanan buchanan is offline

In Java a char is a 16-bit value holding a UTF-16 code unit; for the values discussed here that is the same as the Unicode character's "code point". This is an internal representation, and any time you do I/O through encoding-aware classes (InputStreamReader, OutputStreamWriter and the other Readers/Writers) it gets converted to or from an external byte representation. The encoding can be either explicitly specified or taken from the platform's default. If the reader doesn't use the same encoding as the writer, the reader doesn't get back the same internal "char" values the writer put in. Both UTF-8 and ISO-8859-1 (Latin-1) are in common use as defaults, so relying on defaults is dangerous when passing data between dissimilar machines. What's insidious is that these two encodings, though strictly speaking incompatible, map 0-127 the same way, so programs that only pass code points in this range appear to work even when they're mismatched.

Below is some code you can play with to observe the various interactions, but the takeaways are 1) Don't use encoding-aware APIs unless what you're passing really is text data, and 2) If you are passing encoded text between different platforms, specify the encoding explicitly.
Code:
/////////////////////////
import java.net.*;
import java.io.*;
import java.nio.charset.*;

public class Reader
{
	// reads a character sequence presumed to be in the platform's default character encoding
	public static void main(String[] argv)
	{
		try {
			System.out.println(Charset.defaultCharset());
			// run w/ -Dfile.encoding=UTF-8 or -Dfile.encoding=ISO-8859-1 on the command line to change the above
	
			ServerSocket ss = new ServerSocket(0);
			System.out.println(ss.getLocalPort());
			Socket s = ss.accept();
			System.out.println(s.getPort());

			BufferedReader reader = new BufferedReader(new InputStreamReader(s.getInputStream())); // uses Charset.defaultCharset()
			//BufferedReader reader = new BufferedReader(new InputStreamReader(s.getInputStream(), "ISO-8859-1")); // explicitly specifies encoding
			char[] data = reader.readLine().toCharArray(); // convert the incoming encoded sequence assuming it's in "our" encoding
			// if our encoding matched the writer's (whatever it was) all is well
			// if there's a mismatch, we get various kinds of garbage, depending on who used what
			// for UTF-8/ISO-8859-1 mismatches, the garbage only shows up in code points > 127, since their encodings happen to match for 0-127
			for(int i = 0; i < data.length; i++) {
				System.out.print((int)data[i] + " ");
			}
			System.out.println();
		}
		catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}
/////////////////////////
public class Writer
{
	// writes a character sequence in the platform's default character encoding
	public static void main(String[] argv) // supply the Reader port number in argv[0]
	{
		try {
			System.out.println(Charset.defaultCharset());
			// run w/ -Dfile.encoding=UTF-8 or -Dfile.encoding=ISO-8859-1 on the command line to change the above
		
			Socket s = new Socket(InetAddress.getLocalHost(), Integer.parseInt(argv[0]));
			System.out.println(s.getLocalPort());

			BufferedWriter out = new BufferedWriter(new OutputStreamWriter(s.getOutputStream())); // uses Charset.defaultCharset()
			//BufferedWriter out = new BufferedWriter(new OutputStreamWriter(s.getOutputStream(), "ISO-8859-1")); // explicitly specifies encoding
			char[] data = {1, 2, 3, 100, 150, 180}; // a "char" is a 16-bit unicode "code point"
			out.write(data); // the OutputStreamWriter encodes the chars in its charset
			// under UTF-8, the last line writes 1 2 3 100 194 150 194 180 (four 1-byte sequences and two 2-byte sequences)
			// under ISO-8859-1 it's 1 2 3 100 150 180 (all 8-bit values)
			out.write("\n"); // writes a 10 (in either encoding)
			out.flush();
		}
		catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}
/////////////////////////
public class RawReader
{
	// reads a stream of bytes; nothing here is affected by the JVM's default encoding
	public static void main(String[] argv)
	{
		try {
			System.out.println(Charset.defaultCharset());
		
			ServerSocket ss = new ServerSocket(0);
			System.out.println(ss.getLocalPort()); // pass this port to Writer or RawWriter in argv[0]
			Socket s = ss.accept();
			System.out.println(s.getPort()); // the connecting client's port

			InputStream in = s.getInputStream();
			for (int i = in.read(); i != -1; i = in.read()) {
				System.out.print(i + " ");
			}
			System.out.println();
		}
		catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}
/////////////////////////
public class RawWriter
{
	// writes a stream of bytes; nothing here is affected by the JVM's default encoding
	public static void main(String[] argv) // supply the Reader port number in argv[0]
	{
		try {
			System.out.println(Charset.defaultCharset());
		
			Socket s = new Socket(InetAddress.getLocalHost(), Integer.parseInt(argv[0]));
			System.out.println(s.getLocalPort());

			OutputStream out = s.getOutputStream();
			byte[] data = {(byte)1, (byte)2, (byte)3, (byte)100, (byte)150, (byte)180}; // Java bytes are signed, so 150 and 180 are stored as -106 and -76, but the 8 bits sent are the same
			out.write(data); // no encoding happens here
			//out.write((byte)10); // if we add the EOL (10) here we can duplicate the output of -Dfile.encoding=ISO-8859-1 Writer
			out.close();
		}
		catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}
/////////////////////////
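
The mismatch can also be reproduced in a single process, without a socket or a robot. The sketch below is an assumption-laden illustration, not part of the listing above: the class MismatchDemo and the method roundTrip are hypothetical names, a ByteArrayOutputStream stands in for the network connection, and it assumes a desktop JVM (Java 7 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MismatchDemo {
    // Writes chars through an OutputStreamWriter using writerCs, reads them back
    // through an InputStreamReader using readerCs, and returns the decoded char
    // values as space-separated decimals.
    static String roundTrip(char[] data, Charset writerCs, Charset readerCs) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        OutputStreamWriter out = new OutputStreamWriter(wire, writerCs);
        out.write(data);
        out.flush();

        InputStreamReader in =
            new InputStreamReader(new ByteArrayInputStream(wire.toByteArray()), readerCs);
        StringBuilder sb = new StringBuilder();
        for (int c = in.read(); c != -1; c = in.read()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        char[] data = {1, 2, 3, 100, 150, 180};
        // Writer in UTF-8, reader in ISO-8859-1: the mismatch
        System.out.println(roundTrip(data, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
        // Both ends pinned to the same charset: values survive unchanged
        System.out.println(roundTrip(data, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
    }
}
```

With a UTF-8 writer and an ISO-8859-1 reader, the first line printed is 1 2 3 100 194 150 194 180, the same sequence reported from the robot. With both ends set to ISO-8859-1 the second line is 1 2 3 100 150 180.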
#7
14-03-2011, 14:17
derekwhite derekwhite is offline
Java Virtual Machine Hacker
no team (FIRST@Oracle)
Team Role: Programmer
 
Join Date: May 2009
Rookie Year: 2009
Location: Burlington, MA
Posts: 127
Dealing with UNsigned values in Java

I agree that reading and writing bytes is what you want here. Getting around the "sign extension" problem requires an extra operation...

To recap, most integral types in Java are signed:
byte - signed 8-bit (-128..127)
short - signed 16-bit (-32768..32767)
char - unsigned 16-bit (0..65535)
(mostly for unicode character set, but you can use them as a numeric type)
int - signed 32-bit (-2147483648..2147483647)
long - signed 64-bit (0x8000000000000000L..0x7fffffffffffffffL) (the decimal values are getting less useful here!)

Compared to C, Java is nice in that the limits for each type are the same on every platform. BUT - it's much more of a pain to deal with unsigned values in Java. Also note that Java's "char" is nothing like a C "char".

To deal with unsigned values in Java you need to promote the type to a "larger" type, then mask the value back to an 8 (or 16, or 32) bit range.

BYTE:
int unsignedVal = signedValue & 0xFF;
SHORT:
int unsignedVal = signedValue & 0xFFFF;
INT:
long unsignedVal = signedValue & 0x0FFFFFFFFL;
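
As a quick check of the three masks above, here is a small sketch that applies them to values that wrap negative when stored in a signed type. The class name UnsignedDemo and the overloaded helper unsigned are illustrative names only, not an existing API.

```java
public class UnsignedDemo {
    // Each helper promotes to a larger type, then masks off the sign-extended bits.
    static int unsigned(byte b)  { return b & 0xFF; }
    static int unsigned(short s) { return s & 0xFFFF; }
    static long unsigned(int i)  { return i & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        byte b = (byte) 180;        // stored as -76
        short s = (short) 40000;    // stored as -25536
        int i = (int) 3000000000L;  // stored as -1294967296

        System.out.println(b + " -> " + unsigned(b)); // -76 -> 180
        System.out.println(s + " -> " + unsigned(s)); // -25536 -> 40000
        System.out.println(i + " -> " + unsigned(i)); // -1294967296 -> 3000000000
    }
}
```

On a desktop JVM with Java 8 or later, Byte.toUnsignedInt, Short.toUnsignedInt and Integer.toUnsignedLong do the same thing; whether they are available on the robot's runtime is a separate question, so the explicit masks are the safer bet there.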

To extend buchanan's RawReader example:

Code:
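InputStream in = s.getInputStream();
byte[] buf = new byte[64];
for (int n = in.read(buf); n != -1; n = in.read(buf)) { // read(byte[]) fills buf with signed bytes
	for (int k = 0; k < n; k++) {
		int unsignedVal = buf[k] & 0xFF;                // convert signed byte to unsigned value (0..255)
		System.out.print(unsignedVal + " ");
	}
}
System.out.println();
// note: the single-byte in.read() already returns 0..255 (or -1 at end of stream), so it needs no mask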
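
Another option, if the standard java.io data streams are available on both ends, is to let DataOutputStream and DataInputStream handle the byte values. The following is only a sketch: UnsignedByteStreamDemo and roundTrip are made-up names, in-memory streams stand in for the socket, and I have not checked which of these classes the cRIO's Java runtime provides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class UnsignedByteStreamDemo {
    // Writes each value as a single byte, then reads the bytes back as unsigned ints.
    static int[] roundTrip(int[] values) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(wire);
        for (int v : values) {
            out.writeByte(v); // writes only the low 8 bits of v
        }
        out.flush();

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        int[] result = new int[values.length];
        for (int k = 0; k < result.length; k++) {
            result[k] = in.readUnsignedByte(); // returns 0..255, no masking needed
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        for (int v : roundTrip(new int[] {1, 2, 3, 100, 150, 180, 200})) {
            System.out.print(v + " ");
        }
        System.out.println(); // prints: 1 2 3 100 150 180 200
    }
}
```

No character encoding is involved anywhere in this path, so the JVM's default charset on either machine does not matter.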

Last edited by derekwhite : 14-03-2011 at 14:24.
#8
14-03-2011, 23:18
drakesword drakesword is offline

I had at one point sent raw byte data as well. But the fact remains that when you use read() or readLine(), the API returns two values for each of the larger values, whereas the documentation says read() returns an int between 0 and 255.
I will try again when I have access to the robot.


So essentially I need to select the "other" encoding. To cut down on ping-pong debugging, does anyone know what encoding the cRIO reads natively, so I can set it in the OutputStreamWriter?

Last edited by drakesword : 14-03-2011 at 23:35.