Firstly, 'real-time networking' simply does not exist, for the basic fact that in any network situation you cannot guarantee the delivery of your information in a 'timely' manner; real-time computing is often confused with high-performance computing. Secondly, one cannot 'exceed the NIC capacity': a NIC is not like a water pipe that could burst if you shove too many bits down it. The 'capacity' (or bandwidth) of your NIC is a physical limit on how fast that one connected interface can 'stream' bits (up/down), and it is measured in megabits per second (not megaBYTES, a common misconception .. just for clarity).
1) What happens when traffic exceeds the limit of the NIC capacity? In Windows, is each application's speed reduced equally? Or the application which started earlier has higher priority? How about in Mac or Linux?
Each application you run that requests a 'network resource' (what software engineers refer to as a 'socket') is capable of downloading/uploading at the max speed of the interface the application is 'bound' to (listening on). Keep in mind that this kind of functionality is more or less OS independent (i.e. a socket in Windows has the same basic limitations/concepts as one in Linux/Mac). So if an application (say a web server) is running, it can theoretically 'serve' web pages at the max speed of the interface it's listening on (say 'nic 1' at 1Gb/s), which means that (in theory) it could receive a 125MB file/request in 1 second as well as send a 125MB file/response at the same time (if full duplex). In reality, though, speeds are much, much lower for many reasons...
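To make the 'socket' idea concrete, here is a minimal sketch in Python (the address, port choice, and backlog value are just illustrative; binding to '0.0.0.0' means "all interfaces", while binding to a specific NIC's IP would tie the socket to that one interface):

```python
import socket

# Ask the OS for a TCP socket -- the 'network resource' any app requests.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# Bind to all interfaces; port 0 lets the OS pick a free port.
# Binding to a specific NIC's IP instead would 'bound' the socket to that NIC.
server.bind(("0.0.0.0", 0))
server.listen(5)          # start listening for client connections

addr = server.getsockname()   # (interface address, assigned port)
server.close()
```

The same calls exist (with the same semantics) on Windows, Mac, and Linux, which is the point: the socket concept is OS independent.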
Given this, say our 'perfect' web server now needs to also handle FTP requests, so we load an FTP server on the same machine, listening on the same NIC ('nic 1' in our example). When the FTP server application launches, it too will be capable of receiving and sending up to the theoretical max of the NIC (125MB/s in our example). This is true for ANY application you run (web browser, video game, Netflix/Hulu, etc. etc.). The crux comes when 2 or more applications (web and FTP in our example) are each capable of using the max bandwidth of the NIC: what happens to each, and who gets what? This is where general networking and OS implementation come in.
In our example, say there are 80 people downloading at 10Mb/s from our web server (a total of 800Mb/s), and then 30 more people come in and start downloading from the FTP server at 10Mb/s each, for a total of 1100Mb/s. In theory, each application should get an equal share of the 1000Mb/s of bandwidth the NIC can physically handle (similar to how the OS 'shares' the CPU among the running applications up to 100% CPU load). In reality, there are a LOT of varying factors that decide which application uses what amount of bandwidth at a given instant. These factors range from the number of users requesting resources to simple hardware issues 3 network hops downstream, and they are beyond the control of you and your NIC. Also note that which application started first, or who connected first, makes no difference in bandwidth allocation. In this 1100Mb/s scenario it's very likely that the web server could be running at an effective rate of 60% of bandwidth while the FTP server runs at 40%; then a user could drop, or some other networking issue could occur, and the web server might fall to using only 20% while the FTP server continues at its 40% rate, thus limiting the user(s) on the receiving end of the web server.
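The back-of-the-envelope arithmetic above can be sketched like this (note this proportional split is only the idealized "in theory" case; as said above, a real OS's allocation shifts constantly with TCP dynamics and conditions downstream):

```python
nic_capacity = 1000            # NIC bandwidth in Mb/s

web_demand = 80 * 10           # 80 web clients at 10 Mb/s = 800 Mb/s
ftp_demand = 30 * 10           # 30 FTP clients at 10 Mb/s = 300 Mb/s
total_demand = web_demand + ftp_demand   # 1100 Mb/s -- over capacity

# Idealized proportional sharing when demand exceeds capacity:
scale = nic_capacity / total_demand
web_share = web_demand * scale   # ~727 Mb/s
ftp_share = ftp_demand * scale   # ~273 Mb/s
```

So even in the best case, every client on both servers sees less than the 10 Mb/s they asked for once the NIC is oversubscribed.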
Keep in mind these are all generic scenarios meant to give a better understanding; 'real-world' cases are completely different and require a lot of planning and design work to ensure you can meet the bandwidth needs of your user base (or at least attempt to meet them).
2) When downloading data from the Internet, I think browsers download it as fast as possible. How much does this action influence networking of real time applications?
Browsers do not download as fast as 'they' can; they download as fast as any other application can, and they affect the total bandwidth of the system just as any other application would (see the answer to 1).
3) In windows it is possible to configure the priority of processes. Isn't it possible to set priority of networking similarly? Again how about Mac or Linux? (I know an third-party application to set network priority in windows, but the os doesn't support?)
Setting the 'priority' of 'networking' is possible, but not in the same sense as setting process priority. 'Network priority' can mean a couple of things. One is setting the priority of one NIC over another (this can be done in Win/Mac/Linux and is also known as setting the 'network interface metric'). If your intention, though, is to set the priority of a specific process's networking or of a 'type of network traffic', that is known as 'Quality of Service (QoS)' and can be applied on Win/Mac/Linux as well (though setting up and configuring QoS on your platform is beyond the scope of this question/answer). QoS lets you limit/rate the bandwidth allotted to specific things; for instance, you could set up QoS to allow max bandwidth for torrent downloading when nothing else is downloading, then automatically limit it to only 5% of total bandwidth when other network activity starts, to ensure your torrent downloads aren't eating up all of your bandwidth while you're trying to check your email or watch YouTube (just as an example).
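Just to give a taste of what a QoS rate limit looks like in practice, here is a hedged sketch using Linux's tc (traffic control) tool; this is a config fragment, not a full QoS policy, it requires root, and 'eth0' is an assumed interface name (5% of a 1Gb/s link is ~50Mb/s):

```shell
# Cap all outgoing traffic on eth0 to ~50 Mb/s with a token-bucket filter.
tc qdisc add dev eth0 root tbf rate 50mbit burst 32kbit latency 400ms

# Inspect the queueing discipline that was installed:
tc qdisc show dev eth0

# Remove the limit again:
tc qdisc del dev eth0 root
```

Windows (Group Policy QoS) and macOS (pf/dummynet) have their own, differently configured equivalents; the concept is the same.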
4) Each IP header has the TOS field including the priority setting. I heard most operating systems and routers ignore this field. Is that true?
The 'Type of Service' field as defined by these 2 RFCs has since morphed into what is known as the 'Differentiated Services Code Point' (DSCP). It is not ignored so much as it is simply 'not used'. In other words, just because a packet HAS the field set does not mean you (as the implementer of a networking device) have to actually 'do anything' with it; ToS is a part of QoS, so your specific OS/device must support the ToS/DS fields. That being said, most OSes/network devices have some capability to let you use these fields via QoS policies (at least I don't know of any modern instances that don't have some sort of QoS implementation).
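An application can *request* a ToS/DSCP marking on its own packets via a standard socket option; whether any router along the path actually honors it is, as said above, entirely up to that device's QoS policy. A minimal sketch (the DSCP value 46, 'Expedited Forwarding', is just a common example used for low-latency traffic like VoIP):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# DSCP occupies the top 6 bits of the old ToS byte, so DSCP 46 (EF)
# becomes 46 << 2 = 184 (0xB8) when written via IP_TOS.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0xB8)

# Read it back to confirm what the OS will stamp on outgoing packets:
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
sock.close()
```

Setting the field costs nothing; it only has an effect on networks whose devices are configured with a QoS policy that reads it.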
To reiterate, there is no such notion as 'real-time networking'; instead you would say 'low-latency' (or even 'near real time'), as 'real time' implies a guaranteed delivery time for a requested operation. You cannot guarantee that any operation performed on a network will have a specific delivery time of X; at best you could say that a network operation is guaranteed to have no higher/lower latency than X at a given node.
I also suggest reading up on QoS, as it seems that might be what you're after given the context of your question; QoS can get complex depending on your needs.
I hope that helps.