This nice little page is a Mac rumors site; here's a link to the current Apple 64-bit news and speculation. NOTE: This IS a Mac site, so there's heavy Mac bias and fanfare. If you don't at least kind of like Macs, don't read it.
http://www.spymac.com/forums/?board=...=10389;start=0 |
Comparing PPC to x86 is beyond comparing apples to oranges.
First, what are RISC and CISC? They are two different design philosophies. RISC gives programmers a small set of simple instructions, so the processor makers only have to design around those few operations. CISC provides many complex instructions to make life easier for the programmer: each CISC instruction is roughly the equivalent of several smaller instructions put together. However, with higher-level languages nobody codes entirely in assembly anymore, though they do still use it to optimize their programs. Under RISC, the emphasis is that the programmer (or compiler) will write more instructions, each of them simple.

1. PPC processors tend to be RISC-based. Provided that the 64-bit chip is still a true RISC processor, this adds a lot of complexity to the comparison.

2. x86 processors are generally a RISC-CISC mixture. They follow a CISC model, but they contain many enhancements found in RISC processors. Essentially, all modern x86 processors (P4, Athlon) take complex instructions, break them down into reduced instructions, and feed them through a RISC core that produces the correct output.

For example, let's multiply 2x2:

CISC:
    MULT N1, N2

RISC:
    LOAD A, N1
    LOAD B, N2
    PROD A, B
    STORE N1, A

N1 and N2 are locations in your computer's memory. The CISC instruction takes the numbers at N1 and N2, loads them into the processor's registers, multiplies them, and stores the product back in N1. The RISC sequence does the same thing, as you can see, only in more steps. Nowadays, RISC processors have more and more instructions, but all of them are executed directly by the processor; some even have just as many instructions as CISC processors. CISC processors, as said above, have a RISC core into which the instructions are translated. The line between RISC and CISC is not clearly defined at all, so any argument based on it is hard to make. Quote:
The above statement is most definitely not true and is misleading. Anyone who has taken an assembly programming class will understand the basics of the instruction set (I have, so I will try to explain it). Without confusing you, I will attempt to boil down processor architecture so you can understand it.

Processors have areas inside them where they can store things, a bit like memory, called registers. Space there is very limited. Registers are used for adding things, counting things, and "pointing" to places in memory that contain information. Processors use registers because they are much faster than memory (but also much more expensive to make). These registers are 16 or 32 bits wide, depending on how the application is programmed and what processor you are using (32-bit x86 officially began with the Intel 80386). With 16 bits (sixteen 1's and 0's), a register can represent 2^16 = 65,536 distinct values; with 32 bits, 2^32 = 4,294,967,296. There are ways of counting beyond these limits, but they are complex and take much longer. A 64-bit processor, on the other hand, lets you count up to 2^64 = 18,446,744,073,709,551,616.

What do these numbers mean? Besides counting and adding, registers are also used for pointing (also called addressing) to places in memory. If you know anything about a computer, you know it has memory where it stores all of its information, and the processor has to know where to find it. That's where some of these registers come in. The computer addresses memory in groups, with each number representing one group, and 1 group = 8 bits = 1 byte. Hence it addresses in bytes. A 16-bit processor can address 65,536 bytes of memory, or 64 Kilobytes.
A 32-bit processor can address 4,294,967,296 bytes of memory, or 4 Gigabytes. A 64-bit processor can address 2^64 bytes, about 16 Exabytes (Exa > Peta > Tera > Giga > Mega > Kilo). Basically, this says that 64-bit processors can count really high and can see a lot more memory. However, programs must be rewritten to take advantage of these processors. Even then, the enhancements only matter if the programmer was hitting the older processor's limits: say he needed a number bigger than the 32-bit range and coded a workaround, the program would need to be recoded to remove that workaround. This means you won't notice any improvement in word processors, most video applications, etc. You will notice the most improvement in databases and heavy design work, where larger numbers (and more precise decimals) can be stored. I believe the best quote is "We need these processors to design tomorrow's processors," though I forget where I saw it. More reading material: http://telnet7.tripod.com/articles/cisc_risc.htm |
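The 2^N arithmetic above is easy to verify. Here is a small sketch (plain arithmetic, no hardware access) that computes how many distinct byte addresses a 16-, 32-, and 64-bit pointer can form, formatted in binary units:

```python
# Sketch: how address width limits addressable memory, per the 2^N rule above.

def address_space(bits):
    """Number of distinct byte addresses an N-bit pointer can form."""
    return 2 ** bits

def human(nbytes):
    """Format a byte count using binary units (1 KB = 1024 bytes, etc.)."""
    units = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]
    i = 0
    while nbytes >= 1024 and i < len(units) - 1:
        nbytes //= 1024
        i += 1
    return f"{nbytes} {units[i]}"

for bits in (16, 32, 64):
    n = address_space(bits)
    print(f"{bits}-bit: {n:,} addresses = {human(n)}")
```

Running this prints 64 KB for 16 bits, 4 GB for 32 bits, and 16 EB for 64 bits, which is where the figures in the posts above come from.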
I give major props to Jnadke; this guy knows his processors. But I will tell you now that the PowerPC chip by IBM is not going to be all that great. Why, you may ask? A little thing we of the Apple inside world like to call magic. You heard it: magic, otherwise known as the Velocity Engine. It eliminates the bottleneck, and the IBM chip does not have this little joy. The basic claim is that the vector unit roughly doubles effective processor speed, meaning that the Motorola G4 7455 with the vector unit running at 1.0GHz is, in Wintel terms, running at about 2GHz, and the newer 1.25GHz G4 at about 2.5GHz. The IBM chip does not have the vector unit, from what I have been told by higher officials where I work, meaning you are stuck at a lousy 1.8GHz in Wintel terms. But if they are talking in AMD terms, which claim to measure by the same standards as Apple, well then you are in luck, somewhat.
~Mr. Ivey |
You've got to be kidding me. Spymac is the most unreliable piece of [beep] out there! haha
Oh well. And DJ, don't even get me started. If you knew how to use them, your Macs probably wouldn't crash as much. |
Alright, for some reason I'm going to enter this fray and try to counter some of the misinformation I've heard. First of all, the x86 architecture isn't really CISC and the PPC architecture isn't really RISC. Originally they were CISC and RISC respectively, but both have mutated toward a middle ground. What you probably didn't know is that x86 processors from Intel and AMD actually convert the x86 instruction set into internal micro-operations before executing anything, and that internal format is much closer to a RISC-style architecture than to CISC. Basically, everything after the decode unit is a RISC architecture. As for PPC, ever since the AltiVec unit was added by Motorola, PPC really hasn't been all that much of a RISC design; there are too many specialized instructions for it to truly qualify as RISC. Both of those said, there's really no way to compare the two based on RISC/CISC anymore.
As far as what 64 bits refers to, it is the size of a data word in the processor: the basic unit of data the processor operates on at one time. In most cases this also happens to match the memory pointer size, but not always. The Motorola 68000 is a 32-bit processor with only a 24-bit address bus, while some 32-bit x86 processors can form 36-bit physical addresses (via the Physical Address Extension mechanism, even though pointers stay 32 bits). Now, operating on 64-bit words isn't usually necessary; most code only really needs to operate on 32-bit or smaller values, though there are cases where 64-bit values are useful. The most significant improvement is that most of the 64-bit chips being talked about have 64-bit addressing, which in theory allows up to 2^64 bytes of RAM to be addressed (under 32 bits the limit is 4 Gigabytes; real chips wire up fewer than the full 64 address lines). This isn't particularly useful for most applications at this point, but it does matter on large servers (I'm not particularly talking about webservers; more data-processing servers) and on high-end graphics and CAD machines. For the average user, a 64-bit processor has next to no use at this point in time. Perhaps when 4 Gigabytes of RAM becomes common it will, but that is still quite a ways off.

And for those of you who think a 64-bit processor is new, it's not. I have a ten-year-old machine sitting next to me with a 64-bit processor in it (for the curious, it's a Sun Ultra 1 using the UltraSPARC I processor). The big thing is that traditionally desktop processor companies are building 64-bit processors now.

About the AltiVec unit, or Velocity Engine: it is on this chip. Actually, that's speculation (I haven't seen any official confirmation), but the chip has an added group of instructions that happens to number exactly one fewer than the AltiVec unit produced by Motorola.
While I don't know what that one instruction is, I doubt it's particularly critical to AltiVec performance. As to what the AltiVec unit is, it is a vector processing engine: it performs certain vector operations in hardware instead of software, and that's about all I know about it.

As far as comparing processors goes, it's a lot harder than just comparing clock speeds. On each clock tick, an instruction can (but won't always) move to the next stage in the processor's pipeline, and multiple instructions can be in the pipeline at the same time. For an idea of pipeline sizes, the P4 has a 20-stage pipeline and the Athlon a 12-stage pipeline. A few things can cause the pipeline to stall (meaning instructions don't advance). The primary one is an instruction waiting on the output of an instruction ahead of it in the pipeline that hasn't completed yet; processors use out-of-order execution to try to avoid this. Another is accessing memory: RAM is very slow compared to processor speed. To get around this, there are several layers of cache which are much faster than RAM and store parts of the RAM's data closer to the CPU. However, cache is expensive and can't hold nearly as much data as RAM, so various strategies are used to decide which data to cache. There is also such a thing as branch prediction. Because of the way the pipeline works, a branch-on-condition instruction can be partway through the pipeline before its condition is known. Instead of just stalling, the processor tries to predict which way the branch will go. When it guesses wrong, the prediction fails and the entire pipeline has to be flushed: it was executing the wrong instructions, so all that work is thrown away, and the longer the pipeline, the more this hurts speed.
All of these issues come into play for processor speed, and simply comparing clock speeds won't tell you much. Matt And for the curious, yes, I am a Computer Engineer. :) |
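The branch prediction Matt describes can be sketched with a toy model. The snippet below simulates a 2-bit saturating counter, one classic hardware scheme (states 0-1 predict "not taken", 2-3 predict "taken"); it is an illustration of the idea, not any particular chip's predictor. The branch histories fed to it are made-up examples:

```python
# Toy 2-bit saturating-counter branch predictor; each real outcome
# nudges the counter toward that outcome.

def mispredictions(outcomes, state=2):
    """Count mispredictions over a branch history.

    outcomes: sequence of booleans (True = branch actually taken).
    state:    initial 2-bit counter value, 0..3 (2 = weakly taken).
    """
    misses = 0
    for taken in outcomes:
        if (state >= 2) != taken:
            misses += 1
        # Saturating update toward the actual outcome.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses

loop_branch = [True] * 99 + [False]  # a loop branch: taken 99 times, then exits
alternating = [True, False] * 50     # pathological pattern for this predictor
print(mispredictions(loop_branch))   # near-perfect prediction
print(mispredictions(alternating))   # misses every other branch
```

The loop-like pattern is predicted almost perfectly (one miss, at the loop exit), while the alternating pattern defeats this simple predictor half the time, which is exactly why mispredictions, and the pipeline flushes they cause, matter more as pipelines get longer.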
Thanks Matt Leese & Jnadke for that very thorough explanation.
I know a lot, but definitely not all that. One thing though: I thought the L1, L2, and L3 cache on the processor was responsible for storing computations until they can be transferred to memory, and that that's why you see a major performance boost when the processor's cache has more megabytes. Heck, what do I know. About the Velocity Engine in the new PowerPC 970: it is rumored, as Matt Leese said, that there is probably something compatible with the Velocity Engine on it. |
Quote:
The 36-bit figure doesn't come from adding segment and offset widths; it comes from the Physical Address Extension, where the paging hardware can emit 36-bit physical addresses even though pointers stay 32 bits. Most of that extra range is never used, but it's there for the needy. I was wrong in my earlier calculations. As for 64-bit memory addressing: early x86-64 implementations are expected to wire up fewer than 64 bits, more like 40-48. |
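The 40-48 bit figures mentioned above translate into concrete limits. The sketch below is just 2^N arithmetic; for context, early AMD x86-64 parts implemented 40 physical address bits and 48-bit virtual addresses, both far below the architectural 64-bit ceiling:

```python
# How much memory partial address widths can reach, in terabytes.

TB = 2 ** 40  # bytes in a terabyte (binary)

def addressable_tb(bits):
    """Terabytes reachable with an address of the given width."""
    return 2 ** bits // TB

for bits in (40, 48, 64):
    print(f"{bits}-bit addressing: {addressable_tb(bits):,} TB")
```

So 40 bits reaches 1 TB, 48 bits reaches 256 TB, and the full 64 bits would reach over 16 million TB, which is why no early implementation bothered wiring all 64 lines.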
If the chip does have the AltiVec Velocity Engine, then it will be fast as all get-out. But remember, this is an IBM chip; it will probably not have one, because Motorola owns AltiVec and makes the accelerator engine. Do you really think they are going to hand their technology to a competitor for free? I don't think so. But you must remember that there is a variation of the vector unit, the VMX, that IBM still uses from when the G4 was first being developed. IBM, Apple, Motorola, and Mercury Computer Systems were all partners on it, but they no longer work together as one large group: Motorola and Apple work together, IBM is on its own, and I can no longer even find the Mercury Computer Systems website. IBM has the VMX, the original core that the whole AltiVec Velocity Engine is based on, but over the past few years AltiVec has been greatly updated while the VMX stayed very much the same. Ergo, if the new IBM PowerPC chip only has the original VMX, there is no point in getting it, and until OS X "Star Trek" runs the Mac on an IBM chip, the Motorola AltiVec-equipped PowerPC G4+ 7455 will be the better choice. But hey, it's all linear on one level.
~Mr. Ivey |
As far as L1, L2, and L3 go, they are various levels of cache. Cache is just a way to store values from RAM "closer" to the processor: it sits on (or right next to) the processor, so the processor doesn't have to use the main data bus to reach the data. L1 and L2 are built from SRAM cells, multi-transistor flip-flops (typically six transistors per bit), which are very expensive in terms of chip real estate but very fast. L1 is the smallest and fastest level; processors typically carry significantly more L2 than L1, with L2 being somewhat slower but cheaper per bit. L3, on chips of this era, sits outside the actual processor die but still on the same module. That makes it slower than L1 or L2, yet still faster than RAM, because the processor has a dedicated path to it rather than going over the standard data bus. (Main RAM itself is DRAM, which stores each bit with a single transistor and a capacitor; that's why it is dense and cheap but slow.)
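The payoff of this hierarchy can be sketched with the standard average-memory-access-time calculation. The latencies (in cycles) and hit rates below are illustrative assumptions, not measurements of any real chip:

```python
# Sketch: why stacking cache levels pays off, using assumed latencies/hit rates.

def average_access_time(levels, ram_latency):
    """Average memory access time for a cache hierarchy.

    levels: list of (latency_cycles, hit_rate) pairs, ordered L1 first.
    A miss at one level falls through to the next; RAM catches everything.
    """
    time = ram_latency
    for latency, hit_rate in reversed(levels):
        time = latency + (1.0 - hit_rate) * time
    return time

levels = [(2, 0.95), (10, 0.90), (40, 0.80)]        # assumed L1, L2, L3
print(average_access_time(levels, ram_latency=200))  # a few cycles on average
print(average_access_time([], ram_latency=200))      # no cache: every access pays full price
```

With these made-up numbers, the hierarchy brings the average access down from 200 cycles to around 3, which is the whole point of spending expensive SRAM real estate on cache.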
As to how cache works: cache consists of lines, where a cache line is a copy of a fixed-size, contiguous block of values in RAM. When a memory read or write is issued, built-in hardware checks whether that address is cached. If it is, the access goes to the appropriate cache instead of memory; if not, it goes directly to memory, and there's a high probability that that block of memory will then be pulled into the cache. There are other strategies for deciding what to cache that I won't go into.

As to the AltiVec unit, I doubt that Motorola gave it to IBM. There's a fairly good chance that IBM developed its own implementation of the AltiVec instructions and designed its chip around that. The one unimplemented instruction may be missing for patent reasons.

The other thing I was going to add is that the newest thing in processor design is VLIW (Very Long Instruction Word), which Intel calls EPIC (Explicitly Parallel Instruction Computing). It's being used in Intel's Itanium series of 64-bit processors. VLIW takes multiple instructions at a time (for Itanium, three), and the curious thing is that all of them must be independent of one another, able to execute in any order or at the same time. Because of this, Itanium does not do out-of-order processing itself; the assembly is supposed to specify which instructions can be executed in parallel. This means a lot of silicon can be saved by not implementing out-of-order logic, and the processor can be made faster instead (in general, smaller processors mean faster processors). It also saves on heat dissipation and can lower cost.
The hard part about this is that it requires the compiler to do a lot of work to find instructions that can be executed in parallel, and it makes assembly extremely hard to write by hand. Because of this, it hasn't really caught on yet. The other possible reason is Itanium's high price and lack of performance; the Itanium 2 is supposed to correct some of these issues. Matt |
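The VLIW idea above, packing mutually independent instructions into fixed-width bundles, can be sketched with a toy scheduler. The three-slot width follows the Itanium description in the post; the instruction encoding and the sample program are hypothetical, chosen only to show how a dependence forces a new bundle:

```python
# Toy VLIW bundler. An instruction is (dest_register, set_of_source_registers);
# two instructions conflict if either writes a register the other reads or writes.

def bundle(instrs, width=3):
    """Greedily pack instructions into bundles of independent operations."""
    bundles, current = [], []
    for dest, srcs in instrs:
        conflict = any(
            dest == d or dest in s or d in srcs  # write-write or read/write overlap
            for d, s in current
        )
        if conflict or len(current) == width:
            bundles.append(current)
            current = []
        current.append((dest, srcs))
    if current:
        bundles.append(current)
    return bundles

program = [
    ("r1", {"r0"}),        # r1 = f(r0)
    ("r2", {"r0"}),        # r2 = g(r0)      independent: same bundle
    ("r3", {"r1", "r2"}),  # r3 = h(r1, r2)  depends on both: new bundle
]
print([len(b) for b in bundle(program)])
```

In real EPIC the compiler does this analysis (much more thoroughly), which is exactly the "a lot of work for the compiler" Matt mentions.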
Motorola would never hand AltiVec over without a multi-billion, or at least multi-hundred-million, dollar payment. Anyhow, we are talking big megabucks. Now, IBM has the VMX, which is mainly for audio/video, with no general speed enhancements. The original AltiVec was basically the same as the VMX, but AltiVec has since evolved into two major segments, software and hardware (or "the fourth lateral," as some insiders like to call it). The VMX is just a vector unit on the processor that improves A/V output, which all in all really isn't that great. IBM may have something, but not like AltiVec.
~Mr. Ivey |
But what would the standard consumer use a 64-bit chip for NOW? Very few of the applications that the regular Joe Moe of Boise, Idaho would use: Quake, Tribes, MS Word, or AOL (no matter how crappy it is). The average consumer looks two months ahead: will my favorite DVD be out? What about a good game? If they have the choice of buying a 64-bit machine versus a 32-bit one and the 32-bit is cheaper, it doesn't matter if the 64-bit can go 4GHz; people will only need and use 32-bit until all their apps move to 64-bit.
|
Quote:
Why does any normal person (not a hardcore gamer, mind you) need anything more than 400MHz with 128MB of RAM, maximum? Bigger numbers are better, apparently. Most people don't know that there are differences between PPC clock cycles and x86 clock cycles, or that, under certain circumstances, an AMD processor at a lower clock speed can outperform an Intel processor. A fool and his money are soon parted, and high-end computers can get rather expensive, if you ask me... |
Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2017, Jelsoft Enterprises Ltd.
Copyright © Chief Delphi