Learn about the char type and the byte type, and a bit about text encoding.
-
0:00
In this video, we'll be using the C# interactive window,
-
0:03
which is a REPL feature inside Visual Studio.
-
0:07
To open it up, we'll go to View > Other Windows > C# Interactive.
-
0:14
Inside this window, we can execute C# code just like we can in a Workspaces console.
-
0:20
I mentioned earlier that our stream reader deals with encoding.
-
0:24
Let's take a look at our text file.
-
0:27
It contains the string "Hello, world!"
-
0:30
So what is a string?
-
0:31
Well, a string is made up of one or more characters.
-
0:35
That brings us to the char type.
-
0:38
The char is a type that represents a single character.
-
0:41
The way we assign a value to a char type in C# is similar to a string, but
-
0:45
instead of using double quotes, we use single quotes.
-
0:49
So here's how we'd create a char variable.
-
0:53
char capitalH equals,
-
0:56
single quote, H, and semicolon.
-
1:02
capitalH.
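Here's a minimal sketch of what was just typed, written as statements you could paste into the C# Interactive window. The greeting and firstLetter variables are not from the video; they're added only to show how a char relates to a string.

```csharp
using System;

// A char literal uses single quotes; a string literal uses double quotes.
char capitalH = 'H';
string greeting = "Hello, world!";

// A string is a sequence of chars, so indexing into it yields a char.
char firstLetter = greeting[0];

Console.WriteLine(capitalH);                 // H
Console.WriteLine(firstLetter == capitalH);  // True
```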
-
1:05
Let's go find the documentation on the char type.
-
1:08
C# char.
-
1:12
Here's something.
-
1:15
This page is for the keyword.
-
1:16
We can click on System.Char to get to the actual type.
-
1:21
So check this out.
-
1:23
It's actually a struct and not a class.
-
1:26
A struct is a lot like a class, but it has some limitations.
-
1:30
We'll get into a little more about structs later on.
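If you want to check this from code rather than from the documentation, a small sketch like the following should do it. It isn't shown in the video; it only uses typeof to confirm that the char keyword and System.Char are the same type, and that the type is a value type (a struct) rather than a class.

```csharp
using System;

// The char keyword is an alias for the System.Char struct.
Type keywordType = typeof(char);
Type structType = typeof(System.Char);

Console.WriteLine(keywordType == structType);    // True
Console.WriteLine(keywordType.IsValueType);      // True: structs are value types
Console.WriteLine(typeof(string).IsValueType);   // False: string is a class
```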
-
1:33
It says, Represents a character as a UTF-16 code unit.
-
1:39
So what does it mean by UTF-16 code unit?
-
1:42
UTF-16 is a Unicode character encoding format.
-
1:47
Without going into too much detail, it's good to keep in mind that
-
1:50
every piece of text, even every character, has some kind of encoding behind it.
-
1:56
This is because computers only know how to deal with numbers, and
-
1:59
they need some way to translate the numbers into characters.
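To see the number behind a character, one option is to convert a char to an int and back. This is a sketch that isn't part of the video; the variable names code and fromCode are made up for illustration.

```csharp
using System;

char capitalH = 'H';

// A char converts implicitly to int, which exposes its numeric code.
int code = capitalH;

// Going the other way needs an explicit cast.
char fromCode = (char)72;

Console.WriteLine(code);      // 72
Console.WriteLine(fromCode);  // H
```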
-
2:03
Encoding formats are kind of like the Rosetta Stone for computers.
-
2:07
Unicode formats have many characters in their sets, and each character has a code,
-
2:12
usually called a code point.
-
2:16
This is so we can accommodate languages that have way more characters than
-
2:20
the standard English alphabet.
-
2:22
Let's look up the Unicode character for the lowercase letter h.
-
2:26
Do unicode letter h.
-
2:30
All right, here's the capital letter H.
-
2:32
Let's see if we can get to it from there.
-
2:36
Lowercase, U+0068.
-
2:39
We can use this value to create a char variable.
-
2:45
We'll do char lowerH equals, single quote, and then a backslash,
-
2:52
u, and then I'll paste in that code we copied from the web page.
-
2:58
The backslash is indicating an escape sequence like in our directory string,
-
3:03
and the 0068 is a hexadecimal value that represents our lowercase letter h.
-
3:08
Each char in C# is stored as two bytes, which is one UTF-16 code unit.
-
3:14
lowerH.
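As a sketch of the escape sequence in action: the lines below create the char from its code point and check that it matches a plain 'h'. The hexCode variable and the sizeof check are additions for illustration, not something typed in the video.

```csharp
using System;

// \u followed by four hex digits names a UTF-16 code unit.
char lowerH = '\u0068';

// Format the numeric value back into four hex digits.
string hexCode = ((int)lowerH).ToString("X4");

Console.WriteLine(lowerH);         // h
Console.WriteLine(lowerH == 'h');  // True
Console.WriteLine(hexCode);        // 0068
Console.WriteLine(sizeof(char));   // 2 (bytes per char)
```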
-
3:18
Let's try getting the underlying bytes of our lowercase letter h.
-
3:21
byte[] unicodeBytes.
-
3:26
First we'll need to specify an encoding.
-
3:28
So UnicodeEncoding.Unicode
-
3:31
.GetBytes.
-
3:38
And it wants a char array, so I'll pass it a new char array.
-
3:44
And we'll fill it with the lowerH.
-
3:50
Okay, let's see what it's got.
-
3:54
And there's our two bytes.
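Written out, the call looks roughly like this. The sketch uses Encoding.Unicode, which is the same static property the video reaches through UnicodeEncoding.Unicode (UnicodeEncoding inherits it from Encoding). The hex variable is added here just to display the bytes.

```csharp
using System;
using System.Text;

char lowerH = '\u0068';

// Encoding.Unicode is UTF-16, little-endian: low byte first.
byte[] unicodeBytes = Encoding.Unicode.GetBytes(new char[] { lowerH });

string hex = BitConverter.ToString(unicodeBytes);

Console.WriteLine(unicodeBytes.Length);  // 2
Console.WriteLine(hex);                  // 68-00
```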
-
3:56
Now we can convert them back into a string.
-
3:59
string unicodeString equals
-
4:03
UnicodeEncoding.Unicode.GetString, and
-
4:10
we'll pass it the unicodeBytes.
-
4:16
And unicodeString has our letter h.
-
4:22
And notice it's a string, because it's got double quotes.
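The reverse trip can be sketched the same way. To keep the snippet self-contained, it starts from the two bytes written out by hand (0x68, 0x00) instead of reusing the earlier variable, and again uses Encoding.Unicode for the property the video reaches as UnicodeEncoding.Unicode.

```csharp
using System;
using System.Text;

// The two UTF-16 (little-endian) bytes for lowercase h.
byte[] unicodeBytes = { 0x68, 0x00 };

// Decode the bytes back into text; the result is a string, not a char.
string unicodeString = Encoding.Unicode.GetString(unicodeBytes);

Console.WriteLine(unicodeString);         // h
Console.WriteLine(unicodeString.Length);  // 1
```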
-
4:26
So what is the byte type exactly?
-
4:28
In C#, a byte is an integral type, an unsigned eight-bit integer.
-
4:33
Integral types represent whole numbers and have a minimum and a maximum value.
-
4:39
Unsigned means that it can't hold negative values, only zero and positive ones.
-
4:43
An unsigned eight-bit integer can store values from 0 to 255.
-
4:48
Conversely, when you see that a type is signed, it means that it can have a range
-
4:52
of values from a negative number to a positive number.
-
4:55
A signed byte in C#, declared as sbyte, is also eight-bit, but
-
5:00
it can have a minimum value of negative 128 and a maximum value of 127.
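Both types expose their limits as MinValue and MaxValue constants, so the ranges just described can be checked directly. This is a quick sketch, not something shown in the video.

```csharp
using System;

// byte: unsigned 8-bit integer.
Console.WriteLine(byte.MinValue);   // 0
Console.WriteLine(byte.MaxValue);   // 255

// sbyte: signed 8-bit integer.
Console.WriteLine(sbyte.MinValue);  // -128
Console.WriteLine(sbyte.MaxValue);  // 127
```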
-
5:06
Let's create one.
-
5:07
sbyte signedByte = -128.
-
5:14
If we tried to assign it a constant value that's out of range, like 200,
-
5:19
the compiler wouldn't let us.
-
5:21
Let's try it.
-
5:22
sbyte signedByte = 200.
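The compiler can only reject the assignment because 200 is a constant it can see. As a side sketch that goes a little beyond the video, here is what happens when the value is only known at run time. The outOfRange, overflowed, and wrapped names are made up for illustration, and checked/unchecked are standard C# keywords for controlling overflow behavior.

```csharp
using System;

// sbyte signedByte = 200;  // does not compile: 200 is outside -128..127

int outOfRange = 200;       // hypothetical run-time value, not a constant
bool overflowed = false;

try
{
    // checked makes the out-of-range conversion throw instead of wrapping.
    sbyte converted = checked((sbyte)outOfRange);
    Console.WriteLine(converted);
}
catch (OverflowException)
{
    overflowed = true;
}

// unchecked keeps only the low 8 bits, so 200 wraps around to 200 - 256.
sbyte wrapped = unchecked((sbyte)outOfRange);

Console.WriteLine(overflowed);  // True
Console.WriteLine(wrapped);     // -56
```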
-
5:29
We'll get into other signed and unsigned integral types later in this course.
-
5:34
How about a character that's not usually on a keyboard, like the degree symbol,
-
5:38
the unit symbol for a temperature?
-
5:41
Let's go look it up.
-
5:43
Unicode degree symbol.
-
5:45
Here it is.
-
5:49
00B0.
-
5:52
Gonna copy that with Ctrl+C.
-
5:56
Get back to our code.
-
5:57
So here I'll say char degree equals, single quote,
-
6:02
backslash, u, and I'll paste in our code from the Unicode page.
-
6:09
Okay, now let's see what it looks like when the console prints it out.
-
6:13
Console.WriteLine.
-
6:15
We'll type a sentence, The current temperature
-
6:22
is 74.6, then we'll insert our degree symbol.
-
6:29
Whoops, I got an equal sign in here,
-
6:34
needs to be a plus, Fahrenheit.
-
6:42
The current temperature is 74.6 degrees Fahrenheit.
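Put together, the degree example looks something like the sketch below. The exact wording and spacing of the string typed in the video is an assumption, and so is the message variable. Whether the symbol displays correctly can also depend on the console's own encoding and font.

```csharp
using System;

// U+00B0 is the degree sign.
char degree = '\u00B0';

// Concatenating a string with a char uses +, and the char becomes part of the string.
string message = "The current temperature is 74.6" + degree + "F";

Console.WriteLine(message);
```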
-
6:46
In the .NET Framework, when strings are created in memory,
-
6:50
they're encoded as UTF-16.
-
6:53
You'll also see UTF-8, which is usually the encoding for text files.
-
6:58
In IO, we can specify different encodings if we need to.
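As a rough sketch of specifying an encoding in IO: the snippet below writes a small UTF-8 file to a temporary path and reads it back with a StreamReader that is told to use UTF-8. The temporary file, the sample text, and the variable names are all assumptions for illustration; none of them come from the video.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sample file in the temp directory.
string path = Path.GetTempFileName();
File.WriteAllText(path, "74.6\u00B0F", Encoding.UTF8);

string contents;

// Pass the encoding explicitly instead of relying on detection.
using (var reader = new StreamReader(path, Encoding.UTF8))
{
    contents = reader.ReadToEnd();
}

File.Delete(path);

Console.WriteLine(contents == "74.6\u00B0F");  // True
```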
-
7:01
But you usually don't have to worry about it unless you're
-
7:04
communicating with other systems that may need a different encoding.
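One way to see why the encoding matters to another system is to compare the bytes the same text produces under each one. This is a sketch with a made-up sample string; it is not from the video.

```csharp
using System;
using System.Text;

// Five chars: '7', '4', '.', '6', and the degree sign.
string text = "74.6\u00B0";

// UTF-16 uses two bytes for each of these chars.
byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);

// UTF-8 uses one byte for each ASCII char and two for the degree sign.
byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);

Console.WriteLine(utf16Bytes.Length);  // 10
Console.WriteLine(utf8Bytes.Length);   // 6
```

A receiver that decodes with the wrong encoding would see different text, which is why the two sides have to agree on one.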
-
7:07
Check out the notes if you want to read more about different encodings in .NET.