Hello everyone, this is Bo2SS~ Time flies, it's been a year since graduation, and the company has injected fresh blood again. There is a big front-end newcomer training in the department, and I bravely signed up to share some knowledge related to HTTP. In fact, I had not systematically studied HTTP before, so I prepared for this sharing for two months in advance. After the sharing last week, according to the feedback from the questionnaire, everyone gave a five-star 🌟 rating, so I’ll record it here~
Preface#
As the title suggests, today we will talk about HTTP and HTTPS in network communication.
Q: Why share this topic?
-
🫡 Common in life. HTTP is everywhere on the internet: we use it when we watch dramas, scroll through short videos, or search Google while programming. As developers, we have an obligation to understand it in depth.
-
🤔 Common in work. In our work, we often encounter related issues, such as when debugging front-end and back-end interfaces, if we encounter unexpected situations, the first thing we need to pay attention to is some information in the HTTP request. We should be familiar with its structure and some specifications.
-
📖 Ideas to reference. HTTP has developed for more than 30 years, with three major versions, and many of its design ideas are worth referencing in our development.
Q: What materials did you refer to?
I did a lot of preparatory work before this sharing, mainly referring to:
-
The course “Understanding the HTTP Protocol” on Geek Time. I might have listened to this series about ten times. If anyone wants to delve into more details, you can check it out. The author also provides a practical learning repository chronolaw/http_study, through which you can easily set up a web server and access the resources inside via a browser to understand HTTP.
-
Xiaolin Coding. This is a public account I have been following for a long time, which has many illustrated articles about computer fundamentals. It now also has a Xiaolin Coding website version.
-
Bo2SS. This is my own public account, right here👋. I have previously written two articles related to HTTPS:
In addition, some third-party images are also cited in the article, and I won't list the sources one by one. If there are any inappropriate parts, please feel free to contact me.
Q: What is the goal of this sharing?
-
🔍 Quickly locate HTTP issues. As mentioned earlier, when we are debugging front-end and back-end, we may often encounter situations where the results do not meet expectations. We should first be able to quickly locate whether the issue is on the back-end or front-end through the status code.
-
🥣 Familiarize with common header fields in HTTP messages. By familiarizing ourselves with common header fields, we can not only master the basic functions of HTTP but also learn many design ideas of HTTP. Where can we see this message? Generally, each end will have packet capture tools to view it.
-
🔐 Understand basic encryption knowledge. In the internet age, user privacy and business confidentiality are very important.
🏁Ultimate goal: After reading this article, you will have the ability to independently delve into HTTP, such as using WireShark, Chrome, Telnet tools, and even looking at RFC documents, which contain almost all important information related to the network.
➕ Some materials: User guides for various tools (WireShark, Chrome, Telnet, RFC document summary).
Q: What content will be shared this time?
This is the outline of this sharing:
In short, today we will discuss what HTTP and HTTPS are, and how they have developed.
Alright, let's get to the main topic:
What is HTTP?#
What is HTTP, and what is it not?#
HTTP stands for Hypertext Transfer Protocol, which means hypertext, transfer, and protocol. Let's explain it from the back to the front~
-
Protocol. What is a protocol? Think of a rental agreement or a tripartite agreement; the meaning is the same. A protocol implies two or more participants, and it represents a set of agreements and specifications that stipulate what each participant can and cannot do.
-
Transfer. Then there is transfer, which we can relate to express delivery, specifically transferring between two points. The key points are two: First, it is bidirectional; we can send and receive packages; second, the transfer process can have intermediaries. For example, when we send a package, it goes through the courier, the express company, the logistics warehouse, etc., before reaching the recipient, and all these intermediaries also follow the protocol.
-
Hypertext. Finally, regarding hypertext, it refers to text that transcends ordinary text. Here I want to ask everyone a question: Besides text, images, audio, and video formats, what is the most critical format of hypertext? That's right, it's hyperlinks. Hyperlinks allow us to jump from one "hypertext" to another, transforming our text from a linear structure to a non-linear web structure.
In summary: “HTTP is a set of agreements and specifications for transferring hypertext data such as text, images, audio, and video between two points in the computer world.”
This image shows the participants in the basic HTTP communication process. From this image, we can clarify what HTTP is not, thus gaining a clearer understanding of HTTP.
-
HTTP is not an entity, such as the web browser on the left (sender) and the web server on the right (receiver).
-
HTTP is not the internet; the hypertext resources transmitted by HTTP are just a part of internet resources.
-
HTTP is not a programming language, but HTTP supports various programming languages for implementation.
-
HTTP is not HTML; HTTP can transmit HTML, and HTML is a common format for hypertext.
-
HTTP is not an isolated protocol. Typically, there are some underlying protocols supporting HTTP, such as TCP, IP, DNS, etc.; above HTTP, there are also some protocols that depend on HTTP, such as WebSocket, HTTPDNS, etc. These protocols are interwoven, forming a network of protocols, with HTTP at the center.
Overview of the HTTP World#
Let's take a look at the overall picture of the HTTP world, mainly divided into application-related and theory-related. With a more macro understanding of the HTTP communication link, we can more clearly identify which link may cause the problem when locating issues.
HTTP-related Applications#
Looking from right to left:
-
Internet - WWW: The internet is the global network of interconnected computers, storing all kinds of information resources. WWW, short for the World Wide Web, is a subset of the internet. It is based on HTTP, so it stores hypertext resources, which account for about 90% of the resources on the internet.
-
Web Browser: The web browser is the requester in the HTTP communication process and can display the requested resources.
-
Web Server: The web server is the responder in the HTTP communication process, managing network resources. It is generally divided into hardware and software; hardware refers to physical servers, cloud servers, etc., while software refers to applications like Nginx, Apache, etc.
-
CDN: Content Delivery Network, which acts as an intermediary in the HTTP transfer process, serving as a network proxy. It can cache server resources, speed up network responses, and provide load balancing and security protection capabilities.
-
Crawler: This refers to web crawlers. Similar to web browsers, it can also be understood as a type of user agent, generally used by major search engines to automatically crawl data and store it in databases.
-
Others: Other components include HTML, web services, WAF. Web services can be understood as specific services or service development specifications running on web servers. WAF stands for Web Application Firewall, which is also a type of proxy.
HTTP-related Theories#
Again, looking from right to left, the right side shows HTTP/1.1, HTTPS, HTTP/2, and HTTP/3, which are the main protocols we will discuss today. On the left:
- TCP/IP: It actually represents a protocol stack that contains many network communication protocols.
Here we draw an analogy between the HTTP communication process based on TCP/IP and express delivery:
1)Hypertext => MAC: In the left image, the column on the left, the hypertext to be transmitted goes from the application layer to the link layer, and each layer adds the corresponding header, such as HTTP header, TCP header, IP header, MAC header, just like the packaging process of express delivery;
2)MAC => Hypertext: In the right column of the left image, the data being transmitted has a header removed at each layer, just like the unpacking process of express delivery.
- URI - URL: URI (I - Identifier) is a Uniform Resource Identifier, which is divided into URL and URN forms, but since the latter is not commonly used in the internet world, URI generally refers to URL.
1)URL (L - Locator): The address at the top of our browser is the URL.
Its basic components are shown in the image above; we can first focus on the red box part:
-
scheme: The leftmost scheme represents the protocol, such as http, https, ftp, etc. Note that the :// symbol immediately following the protocol is fixed and necessary.
-
host: The middle host is the hostname, also called the domain name, which will be elaborated on when discussing DNS.
-
path: The last part, the path, represents the resource path.
Q: Here’s a question: is the domain name in the example URL www.creatorseo.com/ ?
The answer is no; the trailing slash / belongs to the path, representing the root directory of the accessed host. Since most computers on the early internet were UNIX systems, the path format here adopts the UNIX file path style.
There is another image, which shows the complete format of the URL.
It includes three more components than the previous image:
-
user@: We can fill in user password information in the URL, but it is no longer recommended due to security reasons.
-
?query: This part can add some additional requirements for the resource. It starts with ? and consists of multiple key-value pairs k=v, with each pair connected by &.
-
#fragment: It represents a fragment identifier, which we can understand as an anchor within the resource, used by the client and not sent to the server. When we read some blogs (like clicking on a title in the floating directory to jump to my blog page), this part is used.
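To make the components above concrete, here is a minimal sketch that splits a URL with Python's standard urllib.parse module. The URL itself is a made-up example (not the one from the image), so treat the host, port, and query values as assumptions.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL, invented purely for illustration.
url = "https://user@www.example.com:8443/docs/index.html?k1=v1&k2=v2#section-2"

parts = urlsplit(url)
components = {
    "scheme": parts.scheme,                  # protocol before "://"
    "user": parts.username,                  # the "user@" part
    "host": parts.hostname,                  # hostname / domain name
    "port": parts.port,                      # the ":port" part after the host
    "path": parts.path,                      # resource path, starts with "/"
    "query": dict(parse_qsl(parts.query)),   # k=v pairs joined by "&"
    "fragment": parts.fragment,              # anchor, used only by the client
}

for name, value in components.items():
    print(f"{name:8} = {value!r}")
```

Note that the fragment is parsed locally here; as described above, a browser would not send it to the server.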
💡 Here are two small reminders~ One is that after the host, you can also specify the port using :port. The other is that when discussing URLs, we often talk about the concepts of escaping and encoding, because without them the server might not be able to process the URL correctly. Think about it: if the path also contains a ? symbol, how would the server find the starting position of the query?
-
escape - escaping: For special characters, we generally escape them by converting them directly into their ASCII hexadecimal code and adding a % prefix; for example, SPACE corresponds to %20 and ? corresponds to %3F;
-
encode - encoding: For Chinese characters and other languages, we generally perform UTF-8 encoding first, then escape. If you don't believe it, try copying and pasting a URL containing Chinese characters into WeChat (like clicking on "Read the original" to jump to my blog page).
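The two rules above can be tried out with a short sketch based on urllib.parse.quote from the Python standard library. The helper name escape_segment is hypothetical; passing safe="" is an assumption made here so that reserved characters such as ? and / are escaped as well.

```python
from urllib.parse import quote, unquote


def escape_segment(text: str) -> str:
    """Percent-escape one path segment (hypothetical helper).

    quote() encodes non-ASCII text as UTF-8 first and then escapes each
    byte as %XX, which matches the "encode, then escape" order above.
    """
    return quote(text, safe="")


print(escape_segment("a b"))     # a%20b
print(escape_segment("what?"))   # what%3F
print(escape_segment("中文"))     # %E4%B8%AD%E6%96%87
print(unquote("%E4%B8%AD%E6%96%87"))  # 中文
```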
2)URN (N - Name): Now back to the second form of URI, URN, which marks resources in the form of a namespace plus a specific identifier, such as urn:<NAMESPACE-IDENTIFIER>:<NAMESPACE-SPECIFIC-STRING>. It is not commonly used when we go online, but if you buy a book, you may find a string of characters in the barcode position of each book, such as ISBN xxx-x-xx-xxxx, which is actually a use of URN.
- DNS: Domain Name System, which is an application layer protocol used for domain name resolution, converting domain names into IP addresses.
Let's first look at the structure of a domain name, still using this image. The red box part is the domain name, which has a hierarchical structure separated by dots (.), with the rightmost part being the highest level. From right to left, they are the top-level domain, second-level domain, third-level domain, and so on.
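As a small illustration of this right-to-left hierarchy, the sketch below splits a hostname into its levels. The function name domain_levels is hypothetical, and the hostname is the one from the example URL above.

```python
from typing import List


def domain_levels(hostname: str) -> List[str]:
    """Return the labels of a hostname from the top-level domain downwards."""
    labels = hostname.rstrip(".").split(".")  # tolerate a trailing root dot
    return list(reversed(labels))


levels = domain_levels("www.creatorseo.com")
for depth, label in enumerate(levels, start=1):
    print(f"level {depth}: {label}")
```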
Now let's look at the types of DNS and the steps for DNS to resolve domain names, as shown in the following image:
1)DNS Types. DNS is divided into root DNS, top-level DNS, authoritative DNS, and non-authoritative DNS. There are 13 groups of root DNS distributed globally, which hand DNS resolution over to the corresponding top-level DNS based on the requested top-level domain. The top-level DNS then points to the authoritative DNS based on the second-level domain, and so on until the IP corresponding to the domain name is resolved. Some large companies also build their own DNS, known as non-authoritative DNS, which are more widely distributed. Well-known examples include Google's 8.8.8.8, Level 3's 4.2.2.1, and CloudFlare's 1.1.1.1, etc.
2)Steps for DNS to resolve domain names. The actual resolution process is divided into four steps: the system first looks in the DNS cache, which may be in the browser or the system; if nothing is found, it checks the hosts file, which contains our custom domain-IP mapping rules (the hosts file path on Mac is /etc/hosts); if no match is found, it queries the non-authoritative DNS, which generally defaults to the one specified by our network operator; if still unresolved, it must go through the root DNS resolution process~
💡 Here are some common commands related to domain name resolution (dig, host, nslookup). If you're interested, you can try them in the terminal~
1. DNS addressing process: dig www.baidu.com +trace @8.8.8.8
2. domain name <=> IP: host www.baidu.com
3. domain => IP: nslookup www.baidu.com
If you know how to use WireShark, you can filter out DNS resolution-related packets using the filter port 53.
- Proxy: This refers to proxies. Proxies are generally divided into forward proxies and reverse proxies. Forward proxies are closer to the client, while reverse proxies are closer to the server. The CDN mentioned earlier is a reverse proxy, while the VPN we use to access external networks is a forward proxy.
HTTP Message#
After all this groundwork (which is indeed worthwhile), we finally arrive at the most important part of HTTP!
The so-called HTTP, Hypertext Transfer Protocol, has its most important part in the last word "protocol," which stipulates the format and usage of HTTP messages.
Basic Format#
Let's first look at the basic format of HTTP messages, which can be simply divided into header and body:
1)Header: Consists of the start line and the header fields. The following image shows the structure of the request header and response header in a request:
-
Request Header
-
The Start line consists of request method, URI, HTTP version, space separator, and the final newline character.
-
The Header consists of individual key:value pairs and the final newline character. Note that there should not be extra spaces before the colon; if you don’t believe it, try using the telnet command (you can install telnet on Mac using brew install telnet, and it is recommended to use it in conjunction with the practical repository provided by Geek Time, chronolaw/http_study).
-
-
Response Header
-
The Start line consists of HTTP version, status code, status code explanation, space separator, and the final newline character.
-
The Header structure is the same as that of the request header.
-
2)Body: Generally, the specific content of the body is agreed upon based on the business; it is optional.
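The layout described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (text bodies, no repeated header fields); the function names build_request and parse_response and the host www.example.com are hypothetical.

```python
from typing import Dict, Tuple

CRLF = "\r\n"  # every line of the header part ends with CR LF


def build_request(method: str, target: str, headers: Dict[str, str], body: str = "") -> str:
    """Assemble start line + header fields + empty line + optional body."""
    start_line = f"{method} {target} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join([start_line, *header_lines, "", body])


def parse_response(raw: str) -> Tuple[str, int, str, Dict[str, str], str]:
    """Split a response into version, status code, reason, headers, body."""
    head, _, body = raw.partition(CRLF + CRLF)
    status_line, *header_lines = head.split(CRLF)
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.lower()] = value.strip()
    return version, int(code), reason, headers, body


request = build_request("GET", "/", {"Host": "www.example.com"})
print(repr(request))
print(parse_response("HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"))
```

The empty line after the last header field is what separates the header from the body, which is why the parser splits on two consecutive CRLFs.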
Next, let's take a look at the specific request methods and status codes in the request line~
Request Methods#
HTTP/1.1 specifies eight request methods, which can be divided into commonly used and less commonly used categories, with another category being extended request methods. Note that these methods must be in uppercase.
Here are the commonly used request methods:
-
GET and HEAD, used to retrieve resources from the server. The difference between the two is that HEAD only retrieves header information, while GET retrieves the complete header and body information. So if you just want to confirm whether a resource exists or only need header information, you can use a HEAD request to reduce the transmission volume.
-
POST and PUT, used to send resources to the server. The difference between the two is that the former creates a resource on the server, similar to the CREATE operation in a database, while the latter modifies a resource on the server, similar to the UPDATE operation. The two are quite similar, and PUT is used less frequently in practical applications.
💡 Speaking of request methods, two concepts are often mentioned: safety and idempotence.
-
Safety: Refers to not making substantial modifications to server resources, so the aforementioned GET and HEAD are safe.
-
Idempotence: Refers to whether the result remains the same after executing the same operation multiple times, so the aforementioned GET, HEAD, and PUT are all idempotent.
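One way to keep these two properties straight is a small lookup table. The sketch below only covers the four commonly used methods discussed above; the table and function names are hypothetical.

```python
# Safety and idempotence of the four commonly used methods described above.
METHOD_PROPERTIES = {
    "GET":  {"safe": True,  "idempotent": True},
    "HEAD": {"safe": True,  "idempotent": True},
    "PUT":  {"safe": False, "idempotent": True},   # repeating an update gives the same result
    "POST": {"safe": False, "idempotent": False},  # repeating a create may add another resource
}


def is_safe(method: str) -> bool:
    return METHOD_PROPERTIES[method]["safe"]


def is_idempotent(method: str) -> bool:
    return METHOD_PROPERTIES[method]["idempotent"]


for name in METHOD_PROPERTIES:
    print(name, "safe" if is_safe(name) else "unsafe",
          "idempotent" if is_idempotent(name) else "non-idempotent")
```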
Response Status Codes (5 Categories)#
Now we arrive at our first goal: 🔍 Quickly locate HTTP issues through status codes.
Status codes are generally divided into 5 major categories:
1xx: Informational. Generally represents an intermediate state of a request, which is relatively rare.
2xx: Success. This indicates that the request is as expected, and it is what we most want to see.
3xx: Redirection. The location of the resource has changed, and the client needs to resend the request to the new location.
4xx: Client error. If you see this, think about whether the request message is filled out incorrectly.
5xx: Server error. If you see this, you need to confirm the problem with the server-side colleagues.
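Since the category is determined by the first digit, a first-pass diagnosis can be automated. The following sketch uses Python's standard http.HTTPStatus for the reason phrases; the names CATEGORIES, classify, and describe are hypothetical.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",   # check the request message first
    5: "Server error",   # confirm with the server-side colleagues
}


def classify(code: int) -> str:
    """Map a status code to its category using the first digit."""
    return CATEGORIES.get(code // 100, "Unknown")


def describe(code: int) -> str:
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:       # a code the standard library does not know
        phrase = "(non-standard)"
    return f"{code} {phrase}: {classify(code)}"


for code in (200, 304, 404, 502):
    print(describe(code))
```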
Common specific status codes can refer to the following image:
-
301: Permanent redirection, you can change the requested URL.
-
302: Temporary redirection.
-
304: The server resource has not changed, so it redirected to the local cache.
-
401: Unauthorized error, generally related to authentication and login.
-
403: Access denied, possibly accessing sensitive information.
-
404: Resource not found, possibly due to a wrong resource path or lack of permission to access (the error code is custom set by the server; 404 is generally used to indicate that the resource is not found, but it can also extend its use to obscure the specific reason).
-
405: Request method not allowed.
-
502: Error code returned by the gateway or proxy, generally indicating an error accessing the server behind the gateway or proxy.
-
503: Server temporarily unavailable, please try again later.
💡 For 400 and 500, they are relatively vague error codes, sometimes returned as fallback error codes indicating an unknown error has occurred; other times, it is because the server does not want to expose too many details. In any case, the server can customize status codes as long as it adheres to public understanding as much as possible.
Common Header Fields (8 Types)#
Quickly, we arrive at the second goal of this article: 🥣 Familiarize with common header fields in HTTP messages.
First, we divide header fields into three main categories: Request, Response, and Universal. The Universal category includes the Entity subclass. Request header fields are used by the requester, Response header fields are used by the responder, and Universal header fields can be used by both the requester and responder; Entity header fields are generally used to describe body attributes.
The image above lists common header fields. It may look overwhelming, but don’t worry; we will explain them by function. Note: Header fields with the same fill color are generally used together or are related.
Next, we will explain the header fields mainly divided into 8 types by function.
1)Body: Related to body attributes, which can describe those in the request message or those in the response message.
When it comes to body types, we first need to understand what MIME (Multipurpose Internet Mail Extensions) types are. They originated from the email system and are now used to describe body types. Here is a summary of MIME types, and if you click the link, you will find them familiar, such as application/json, text/html, text/javascript, etc. The first half represents a broad category, while the second half represents a specific format.
-
Accept indicates the body types that the requester can accept, which may include more than one.
-
Content-Type indicates the actual body type being transmitted.
To reduce the size of the body, we generally compress it, and common compression formats include gzip, deflate, and br, which work well for text.
-
Accept-Encoding indicates the compression formats that the requester can support, which may include more than one.
-
Content-Encoding indicates the actual compression format used for the transmitted body.
Since the above compression methods generally work well only for text, for image, audio, video, and other multimedia formats that do not compress well, there is another way to solve the problem of large files, which is chunked transfer.
Transfer-Encoding: chunked indicates that the data is being transferred in chunks.
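The article does not spell out the chunk format, so the sketch below relies on the HTTP/1.1 wire format as an added assumption: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are hypothetical, and chunk extensions and trailer fields are ignored.

```python
from typing import Iterable


def encode_chunked(chunks: Iterable[bytes]) -> bytes:
    """Encode pieces of a body using chunked transfer coding."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # final zero-size chunk plus the closing empty line
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a chunked payload (no extensions/trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end].decode("ascii"), 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"Hello, ", b"world"])
print(wire)
print(decode_chunked(wire))
```

The sender does not need to know the total length in advance, which is what makes this suitable for large or streamed bodies.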
For video-type bodies, such as when we watch videos on Bilibili, this video cannot be requested all at once; we can request the video in segments.
-
Accept-Ranges: bytes is what the server returns if it supports byte range requests. We generally first use a HEAD request to ask the server whether it supports range requests.
-
Range: bytes=x-y lets the requester, when the server supports it, specify that it wants the content from byte x to byte y.
-
Content-Range: bytes x-y/length indicates that the body returned by the server is the content from byte x to byte y, out of a total length of length.
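To show how Range and Content-Range relate, here is a minimal server-side sketch. It assumes the single-range form bytes=x-y only, with both ends inclusive; the function name serve_range is hypothetical.

```python
import re
from typing import Tuple


def serve_range(data: bytes, range_header: str) -> Tuple[bytes, str]:
    """Return the requested slice and the matching Content-Range value."""
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", range_header.strip())
    if match is None:
        raise ValueError("only the simple form bytes=x-y is supported here")
    first, last = int(match.group(1)), int(match.group(2))
    if first > last or first >= len(data):
        raise ValueError("range cannot be satisfied")
    last = min(last, len(data) - 1)      # clamp to the end of the resource
    chunk = data[first:last + 1]          # both ends are inclusive
    return chunk, f"bytes {first}-{last}/{len(data)}"


resource = b"0123456789"
print(serve_range(resource, "bytes=2-5"))
print(serve_range(resource, "bytes=8-20"))
```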
In terms of internationalization, we can also set language requirements.
-
Accept-Language indicates the languages that the requester can understand, which may include more than one.
-
Content-Language indicates the actual language of the transmitted body.
Here’s a specific example: Accept-Language: en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7. There are two details we need to pay attention to:
-
In HTTP header syntax, "," is a stronger separator than ";", which is contrary to the syntax of most programming languages: "," separates the items of a list, while ";" attaches parameters to a single item. Therefore, en;q=0.9 above is one pair.
-
What is the q above? It represents a weight between 0 and 1, with a default of 1. The responder will try to return content in the language with the highest weight.
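Both details can be seen in a small parser: split on "," first, then on ";" within each item, and default the weight to 1. This is a simplified sketch (it ignores malformed input), and the function name parse_accept_language is hypothetical.

```python
from typing import List, Tuple


def parse_accept_language(value: str) -> List[Tuple[str, float]]:
    """Return (language, weight) pairs, highest weight first."""
    result = []
    for item in value.split(","):             # "," separates list items
        lang, *params = [p.strip() for p in item.split(";")]  # ";" adds parameters
        weight = 1.0                           # default when no q is given
        for param in params:
            if param.startswith("q="):
                weight = float(param[2:])
        result.append((lang, weight))
    # sorted() is stable, so equal weights keep their original order
    return sorted(result, key=lambda pair: pair[1], reverse=True)


preferences = parse_accept_language("en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7")
print(preferences)
```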
2)Connection: Related to long connections.
Before HTTP/1.1, the client needed to establish a new connection every time it communicated with the server. If communication occurred frequently, it would repeatedly establish and close TCP connections, as shown on the left side of the image, which is a short connection:
So if we could keep a TCP connection a bit longer, we could communicate multiple times during one connection, which is shown on the right side of the image, i.e., a long connection. HTTP/1.1 supports this.
-
Connection: keep-alive indicates the use of a long connection, which is enabled by default in HTTP/1.1.
-
Connection: close actively closes the long connection and is generally sent by the client.
The server can also decide when to close long connections, which is configured in the web server. For example, in Nginx, keepalive_timeout sets the timeout for long connections, so a connection is closed if no data is sent or received within that time; keepalive_requests sets the maximum number of requests that can be served over one long connection.
With long connections, the client can also initiate multiple requests simultaneously without waiting for the result of the first request to send the second request, which is called pipelined communication.
However, whether short connections or long connections, there will still be a head-of-line blocking (HoL blocking) issue, which is caused by the "request-response" model of HTTP requiring that messages must be "one sent, one received," as can be seen in the following image:
In any case, the receiver must finish processing the red line request before it can handle subsequent requests, even if the latter requests arrive first, meaning "the first sent must be processed first."
To alleviate this problem, one solution is concurrent connections, which means opening multiple long connections to the same domain, with each long connection being independent. However, maintaining long connections consumes server resources and may also be abused in malicious attacks, so the number of concurrent long connections per domain is generally limited to 6 to 8. If that is not enough, there is a clever workaround called domain sharding: if the same server is reachable under multiple domains, the overall limit multiplies accordingly.
3)Redirection: Related to redirection.
When discussing status codes, we mentioned 301 (permanent) and 302 (temporary). I wonder if you still remember their meanings. If such status codes are returned, the response header will definitely indicate the location of the redirection.
Location is the location of the redirection, which generally has two forms: absolute path and relative path. The absolute path corresponds to the basic format of the URL, while the relative path does not include the scheme and host, which are taken from the URL of the original request.
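How a client turns either form of Location into the next request URL can be sketched with urllib.parse.urljoin from the standard library. The URLs and the helper name resolve_location are hypothetical.

```python
from urllib.parse import urljoin


def resolve_location(request_url: str, location: str) -> str:
    """Resolve a Location header value against the original request URL."""
    return urljoin(request_url, location)


original = "https://www.example.com/a/b/page.html"

# Relative forms reuse the scheme and host of the original request.
print(resolve_location(original, "/login"))
print(resolve_location(original, "other.html"))
# An absolute URL is used as-is.
print(resolve_location(original, "https://cdn.example.net/x"))
```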
There are also three other status codes related to redirection: 303 is similar to 302, but the redirected request must use GET; 307 and 308 are similar to 302 and 301 respectively (yes, the order is reversed...), but they do not allow the request method or body to change after redirection.
4)Cookie: Solving the problems caused by HTTP's stateless nature.
First, we need to clarify whether the statelessness refers to the client or the server? That's right, it refers to the server, meaning the server does not know the relationship between the current request and the previous request. This can complicate the server's handling of certain special scenarios, such as shopping.
So here, cookies are used to solve this problem. In simple terms, it is a small note given by the server to the client, marking the identity of a certain client. This client brings this note with them in each request, proving their identity.
-
Set-Cookie: a=xxx and Set-Cookie: b=yyy are returned by the server; a cookie is essentially a key-value pair, and each cookie gets its own Set-Cookie line.
-
Cookie: a=xxx; b=yyy is what the client sends with a request: the cookies returned by the server, combined into one field.
Note that after the client receives these cookies, it will save them on the client side, and we can see them in the Chrome browser.
Oh? Besides Name and Value, how come there are so many attributes for cookies? In fact, cookies returned by the server generally look like this: Set-Cookie: a=xxx; Domain=xx; Path=xx; Max-Age=xx; Expires=xx; HttpOnly; Secure; SameSite=xx...
-
Domain, Path: The cookie will only be sent if the URL requested by the client matches them.
-
Max-Age, Expires: Represent the expiration time of the cookie; the Cache also has similar attributes, and it is important to note that Max-Age takes precedence over Expires.
-
HttpOnly: When present, this cookie can only be transmitted via the HTTP(S) protocol and cannot be accessed by other means; for example, JS can no longer read it through document.cookie, which helps prevent script attacks.
-
Secure: When present, this cookie will only be sent with secure HTTPS requests.
-
SameSite=xxx: Setting SameSite=Strict can strictly limit that this cookie cannot be sent cross-site; SameSite=Lax is a bit more lenient, allowing this cookie to be used in safe requests like GET/HEAD.
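The relationship between Set-Cookie and Cookie can be sketched with a small hand-written parser. This is a simplification for illustration (it does not apply the Domain, Path, or expiry rules), and the function names are hypothetical; for real code, Python's http.cookies module is the more robust choice.

```python
from typing import Dict, List, Tuple, Union


def parse_set_cookie(value: str) -> Tuple[str, str, Dict[str, Union[str, bool]]]:
    """Split one Set-Cookie value into name, value, and attributes."""
    first, *attrs = [part.strip() for part in value.split(";")]
    name, _, cookie_value = first.partition("=")
    attributes: Dict[str, Union[str, bool]] = {}
    for attr in attrs:
        if not attr:
            continue
        key, sep, attr_value = attr.partition("=")
        # Flags such as HttpOnly and Secure carry no value.
        attributes[key.lower()] = attr_value if sep else True
    return name, cookie_value, attributes


def build_cookie_header(set_cookie_values: List[str]) -> str:
    """Combine stored cookies into the single Cookie request field."""
    pairs = []
    for value in set_cookie_values:
        name, cookie_value, _ = parse_set_cookie(value)
        pairs.append(f"{name}={cookie_value}")
    return "; ".join(pairs)


print(parse_set_cookie("a=xxx; Path=/; Max-Age=60; HttpOnly; Secure; SameSite=Lax"))
print(build_cookie_header(["a=xxx; Path=/", "b=yyy; HttpOnly"]))
```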
5)Cache: Related to caching.
Caching is truly ubiquitous, and HTTP requests are no exception. The caching mentioned here is stored on the client side, aimed at minimizing network requests or the size of returned data to improve network transmission efficiency.
-
Cache-Control
-
The attributes that the server can return include: max-age=10 / no-store / no-cache / must-revalidate.
-
The unit of max-age is seconds, starting from the moment of return;
-
no-store means the client is not allowed to cache;
-
no-cache means the client must verify with the server before using the cache;
-
must-revalidate means the cache must be verified after it expires.
-
-
The attributes that the client can send include: max-age=0; no-cache.
-
Generally, cmd + R to refresh the page will carry max-age=0, meaning that data that has existed for 0 seconds will not use local cache but will request a newly generated message from the server; cmd + shift + R to force refresh the page will carry no-cache, which is basically the same, depending on how the server handles it.
-
So when will the cache take effect? Generally, when the client initiates requests during browser forward, backward, or redirection, it will not carry the above two attributes.
-
-
In addition, to increase the flexibility of cache control, there are also some conditional fields~
-
The server returns:
-
Last-Modified represents the last modification time of the file.
-
ETag, which stands for Entity Tag, represents the unique identifier of the resource. It solves the problem that the modification time cannot accurately distinguish file changes. For example, a file may be modified multiple times within a second, while the minimum unit of the modification time is seconds; or a file may have its time attribute changed without any change to its content. ETag is also divided into strong ETag and weak ETag:
-
The condition for the former to remain unchanged is that the resource is unchanged at the byte level.
-
The condition for the latter to remain unchanged is that the resource is unchanged semantically, for example when only a few spaces are added. Additionally, the value of a weak ETag is prefixed with W/.
-
-
-
The corresponding client requests:
-
If-Modified-Since contains the Last-Modified returned by the server during the last request; if the server resource has not been updated since that time, the server returns 304, indicating that the client can use the cache.
-
If-None-Match contains the ETag returned by the server during the last request; if the ETag of the server resource has not changed, the server also returns 304.
-
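The server-side decision for these conditional fields can be sketched as follows. Two points are assumptions added here rather than taken from the text above: If-None-Match is checked first and takes precedence when both fields are present, and ETags are compared weakly (the W/ prefix is ignored). The function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict


def _opaque(tag: str) -> str:
    """Drop the weak prefix W/ so that tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers: Dict[str, str], etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        client_tags = {_opaque(t) for t in if_none_match.split(",")}
        return 304 if _opaque(etag) in client_tags else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        if last_modified <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200


modified = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc)
print(revalidate({"If-None-Match": '"abc"'}, '"abc"', modified))
print(revalidate({"If-Modified-Since": "Wed, 21 Oct 2015 07:28:00 GMT"}, '"abc"', modified))
print(revalidate({"If-None-Match": '"old"'}, '"abc"', modified))
```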
6)Proxy: Related to proxies.
Proxies have dual identities because, from the client's perspective, it is the server, while from the server's perspective, it is the client.
As mentioned earlier, proxies are generally divided into forward proxies and reverse proxies. Reverse proxies are generally used for load balancing (reasonably distributing tasks, deciding which server behind will respond to the request), security protection, encryption offloading (not encrypting communication within the internal network to reduce encryption and decryption costs), content caching (temporarily storing server responses, which will be discussed later, i.e., proxy caching), etc.
In the scenario with proxy servers above, the header fields involved are:
Via: The proxy server will append its hostname and port information to the end of this field when forwarding a message.
However, the server generally needs to know the real IP information of the client to facilitate access control, user profiling, statistical analysis, etc., so the following header fields, although outside the HTTP standard, are widely used as de facto conventions:
-
X-Forwarded-For: Similar to the Via appending method, but the appended content is the requester's IP address.
-
X-Real-IP: Only records the client's IP address, which is a bit simpler.
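The appending behaviour of X-Forwarded-For can be sketched as below. It assumes the common convention that each proxy appends the address it received the request from, so the leftmost entry is the original client; the function names and IP addresses are hypothetical.

```python
from typing import Optional


def append_forwarded_for(existing: Optional[str], peer_ip: str) -> str:
    """What a proxy does: append the address of whoever connected to it."""
    return f"{existing}, {peer_ip}" if existing else peer_ip


def original_client_ip(forwarded_for: str) -> str:
    """What the server does: take the leftmost entry as the client."""
    return forwarded_for.split(",")[0].strip()


# Client 203.0.113.7 -> proxy A (198.51.100.1) -> proxy B -> server
header = append_forwarded_for(None, "203.0.113.7")      # set by proxy A
header = append_forwarded_for(header, "198.51.100.1")   # set by proxy B
print(header)
print(original_client_ip(header))
```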
However, the above methods have a significant drawback: performance loss! The proxy server needs to parse the HTTP message header and modify the message data every time; moreover, in some cases the message must not be modified, or cannot be modified because it is encrypted. Therefore, a dedicated proxy protocol (the PROXY protocol, defined by HAProxy) was later introduced, which is also specified outside the HTTP standard.
Based on this protocol, the proxy server only needs to add a line of text before the HTTP message. For example:
PROXY TCP4 1.1.1.1 2.2.2.2 55555 80\r\n
GET / HTTP/1.1\r\n
Host: www.xxx.com\r\n
\r\n
- It begins with the five uppercase letters PROXY;
- Next comes the type of the client's IP address, such as TCP4 or TCP6;
- Then come the requester's and responder's addresses, followed by their port numbers;
- Finally, it ends with a carriage return and line feed.
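On the receiving side, peeling that first line off the byte stream is cheap. Below is a minimal sketch that follows the four points above; `ProxyInfo` and `split_proxy_header` are illustrative names, and only the TCP4/TCP6 text form shown in the example is handled.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    family: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def split_proxy_header(data: bytes):
    """Split incoming bytes into (ProxyInfo, remaining HTTP message bytes)."""
    line, separator, rest = data.partition(b"\r\n")
    if not separator or not line.startswith(b"PROXY "):
        raise ValueError("missing PROXY header line")

    parts = line.decode("ascii").split(" ")
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY header line")

    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return ProxyInfo(family, src_ip, dst_ip, int(src_port), int(dst_port)), rest


raw = b"PROXY TCP4 1.1.1.1 2.2.2.2 55555 80\r\nGET / HTTP/1.1\r\nHost: www.xxx.com\r\n\r\n"
info, http_bytes = split_proxy_header(raw)
```

Here `info.src_ip` is the real client address, and `http_bytes` is the untouched HTTP message, which is exactly why the proxy no longer needs to parse or rewrite the headers.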
7)Proxy Cache: Related to proxy caching.
Clients can cache, and intermediary proxy servers can cache as well. However, because of the proxy's dual identity, Cache-Control adds some attributes specific to proxy caching~
-
From the server to the proxy server:
- private: the data may only be stored on the client and must not be cached on the proxy for sharing with others, such as private user data.
- public: the data is completely open and can be cached by anyone.
- s-maxage: the lifetime of the cache on the proxy server.
- no-transform: the proxy server is prohibited from transforming the data in any way, as some proxies may convert the data format in advance to make subsequent requests easier to handle.
-
From the client to the proxy server:
- max-stale: the client accepts cached data that has been expired for up to the given period of time.
- min-fresh: the opposite of the above; the cached data must remain valid for at least the given period of time.
- only-if-cached: the client only accepts the proxy's cache. If the proxy has no matching cache, the client does not want it to request the server again; the proxy typically answers with 504 (Gateway Timeout) instead.
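Here is a rough sketch of how a proxy cache could apply these attributes. All function names are invented for this example, ages and lifetimes are in seconds, and the freshness rule is simplified compared with what a real cache implements.

```python
def parse_cache_control(value):
    """Parse 'a, b=1' into {'a': None, 'b': '1'} with lower-cased names."""
    directives = {}
    for item in value.split(","):
        name, _, argument = item.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"') or None
    return directives


def proxy_may_store(response_cc):
    """A shared (proxy) cache must not store private or no-store responses."""
    d = parse_cache_control(response_cc)
    return "private" not in d and "no-store" not in d


def proxy_lifetime(response_cc):
    """Lifetime on the proxy: s-maxage wins over max-age when both exist."""
    d = parse_cache_control(response_cc)
    for name in ("s-maxage", "max-age"):
        if d.get(name) is not None:
            return int(d[name])
    return 0


def proxy_may_serve(age, lifetime, request_cc=""):
    """Apply the client's max-stale / min-fresh to a cached entry."""
    d = parse_cache_control(request_cc)
    min_fresh = int(d.get("min-fresh") or 0)
    if "max-stale" in d:
        # max-stale without a value accepts any amount of staleness.
        allowed = float("inf") if d["max-stale"] is None else int(d["max-stale"])
    else:
        allowed = 0
    return age + min_fresh <= lifetime + allowed


lifetime = proxy_lifetime("public, s-maxage=600, max-age=60")
```

For the response above, the proxy keeps the entry for 600 seconds while a browser would only keep it for 60. A request carrying `max-stale=60` would still be served from a copy that expired 30 seconds ago, whereas `min-fresh=30` rejects a copy with only 10 seconds left.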
-
8)Others
Let's take another look at this common header field image. Are you clear about the meaning and usage of each header field now?
Wait, there are still some header field explanations missing above; I will summarize them here:
- Host: the hostname being requested. It must appear in HTTP/1.1 so that a server hosting multiple virtual hosts can tell which host should handle the request; otherwise the server generally will not process the request. Most network frameworks also derive a default Host value from the URL as a fallback, so you usually do not have to fill it in manually.
- User-Agent: the user agent, describing the identity of the requester, so that the server can return an appropriate page layout or data. However, for historical reasons its usage has become somewhat chaotic, as nearly every browser claims to be "Mozilla", "Chrome", "Safari", and so on at the same time.
- Date: the time the message was created, generally appearing in the response headers.
- Server: the name and version number of the software providing the web service. Exposing this information may pose security risks, so the field is sometimes omitted from the response or only gives vague information.
- Content-Length: the length of the message body. If this field is absent, there will generally be a Transfer-Encoding: chunked field instead, which we mentioned earlier.
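As a small illustration of Host and Content-Length, the sketch below assembles a raw HTTP/1.1 request by hand. `build_request` is a made-up helper and `www.example.com` is a placeholder host; a real framework fills in these fields for you.

```python
def build_request(method: str, host: str, path: str, body: bytes = b"") -> bytes:
    """Assemble a minimal HTTP/1.1 request (illustrative, not a full client)."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # Content-Length counts the bytes of the body, not its characters.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request("POST", "www.example.com", "/greet", "你好".encode("utf-8"))
```

The two-character body is six bytes in UTF-8, so the header reads `Content-Length: 6`. Getting this number wrong is a classic cause of truncated or hanging requests.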
Thus, we have completed the explanation of what HTTP is. You can try using Chrome Developer Tools or WireShark to deepen your understanding.
What is HTTPS?#
HTTPS adds an S to HTTP; the S stands for Secure, and that security is provided by the SSL/TLS protocol.
Now we arrive at our third goal: 🔐 Understand basic encryption knowledge.
In this section, since I have previously written related articles, I will try to reduce the length; you can refer to the links below:
-
Information Security | How to Establish Trust in the Internet Age?: Three common cryptographic algorithms, digital signatures, digital certificates.
-
Information Security | (Supplement) How to Establish Trust in the Internet Age?: SSL/TLS, SSH, iOS signing, OpenSSL, WireShark practice.
One additional point worth noting is the difference between TLS's mainstream ECDHE-based handshake and the traditional RSA-based handshake.
The key difference between the two lies in how the third random number, Pre-Master, is produced while generating the communication key:
- The former: each side first generates a random public/private key pair and sends its public key to the other side as a parameter (the server's carries a signature). Both sides then combine their own private key with the other side's parameter to compute Pre-Master using the ECDHE algorithm;
- The latter: the client directly generates the random number Pre-Master, encrypts it with the public key from the server's certificate, and sends it to the server.
Because the former's public and private keys are randomly generated for each handshake, a private key that is leaked or cracked only affects that one communication. The latter's public and private keys are fixed, so once the private key is leaked or cracked, all previously recorded encrypted communication can be decrypted: patient hackers may have been collecting messages for a long time, waiting for this day (it is said that the Snowden Prism incident exploited this point).
In other words, the former uses "one-time keys" and provides forward secrecy, while the latter carries the risk of "intercept today, decrypt tomorrow" and lacks forward secrecy.
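The "one-time key" idea can be tried out with a toy key exchange. Note the assumptions: this sketch uses classic finite-field Diffie-Hellman instead of the elliptic-curve variant, the parameters `P` and `G` are chosen only for readability, and the signature on the server's parameter is left out. It is not suitable for real use.

```python
import secrets

# Toy parameters (assumption for illustration only).
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3


def ephemeral_keypair():
    """Generate a fresh random key pair for a single handshake."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private, peer_public):
    """Combine our private key with the peer's public parameter."""
    return pow(peer_public, own_private, P)


client_private, client_public = ephemeral_keypair()
server_private, server_public = ephemeral_keypair()

# Only the public values cross the network; both ends derive the same secret.
client_secret = shared_secret(client_private, server_public)
server_secret = shared_secret(server_private, client_public)
```

Both sides end up with the same value, which plays the role of Pre-Master, without it ever being transmitted. Since the key pairs are thrown away after the handshake, recording the traffic and stealing the certificate's private key later does not help an attacker recover it.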
For more details, you can refer to the lesson on TLS 1.2 Connection Process Analysis in "Understanding the HTTP Protocol," or try capturing packets with WireShark yourself~
From the perspective of packet capture, the main differences between the two are:
-
The former has an additional "Server Key Exchange" message compared to the latter.
-
The former allows the client to start encrypted communication before the connection is fully established, meaning the client does not have to wait for the server to send back the "Finished" confirmation to complete the handshake, which is called "TLS False Start."
The Development of HTTP#
Let's summarize the development process of HTTP through the table below; today we will have an overall understanding of HTTP's development~
Time | Version | Main changes | Notes |
---|---|---|---|
1989 | 3 key technologies | HTML, URI, HTTP | Paper from Tim Berners-Lee. |
1991 | HTTP/0.9 | 1. Request method: GET. 2. Data: HTML. | No RFC. |
1996 | HTTP/1.0 | 1. +Request methods: HEAD, POST. 2. +Data: images, audio. 3. +Other: HTTP headers, status codes, protocol version. | RFC 1945 (1996). Not a formal standard. |
1999 | HTTP/1.1 | 1. +Request methods: PUT, DELETE. 2. +Cache-Control. 3. +Keep-Alive. 4. +Pipelining (Content-Length), chunked transfer. 5. +Host header (required). | +Google, Sina, Sohu, Netease, Tencent. RFC 2616 (1999). +Facebook, Twitter, Taobao, JD. Split into RFC 7230~7235 (2014). RFC 9112 (2022). |
2015 | HTTP/2 | 1. Transmission data format: text → binary. 2. +Concurrent requests (streams replace pipelining). 3. +Header compression. 4. +Server push. 5. +Combined with TLS 1.2+. | Based on SPDY in the Chrome browser (2009). RFC 7540 (2015). RFC 9113 (2022). |
2022 | HTTP/3 | 1. Transport layer protocol: TCP → QUIC (based on UDP, including TLS 1.3, IP → connection ID). 2. Header compression: HPACK → QPACK. | Based on QUIC in the Chrome browser (2012). RFC 9114 (2022). |
-
Since HTTP/1.0, HTTP has been written into RFC documents (RFC document summary).
-
HTTP/1.1 is the first formal standard of HTTP, and most of the functions were introduced in the common header fields section. During this early stage, companies like Google, Sina, Sohu, Netease, and Tencent were founded, and later Facebook, Twitter, Taobao, JD, and others gradually emerged.
-
HTTP/1.1 was relatively complete in various aspects, but there was still significant room for optimization in performance and security. Therefore, HTTP/2 and HTTP/3 mainly optimized the performance of HTTP.
- HTTP/2 is based on SPDY, a protocol developed by Google and promoted through Chrome. The main changes include:
-
The data transmission format changed from text to binary, greatly facilitating computer parsing.
-
Based on the concept of virtual streams, it achieved multiplexing capability, replacing the pipelining function in HTTP/1.1.
-
Header compression was added using the HPACK algorithm; previously, compression only applied to the body.
-
Allowed the server to proactively push messages by creating new "streams." For example, when the browser requests HTML, it can proactively send the potentially needed JS and CSS files to the client.
-
In terms of security, some enhancements were also made. The encrypted version of HTTP/2 requires its underlying communication protocol to be TLS 1.2 or later (earlier versions had many vulnerabilities), requires support for forward secrecy and SNI (Server Name Indication, a TLS extension in which the client tells the server the hostname it is connecting to at the start of the handshake), and puts hundreds of weak cipher suites on a "blacklist."
-
PS: Compared to HTTP/1.1's approach of opening several concurrent connections, virtual streams solve the head-of-line blocking problem at the HTTP level more elegantly.
- HTTP/3 is based on QUIC, a protocol also developed by Google and promoted through Chrome.
-
First, let's look at QUIC:
-
It implements reliable transmission based on UDP and introduces a stream concept similar to HTTP/2.
-
It includes TLS 1.3, speeding up the connection establishment.
-
Connections use "opaque" connection IDs to mark both ends, rather than being bound to IP addresses and ports, thus supporting seamless connection migration for users.
-
-
Returning to HTTP/3:
-
Its biggest change is switching the underlying transport layer protocol from TCP to QUIC, which completely solves TCP's head-of-line blocking problem (note that it is TCP's, not HTTP's) and performs better in weak network environments. Since QUIC itself already supports encryption, streams, and multiplexing, the workload for HTTP/3 is significantly reduced.
-
The header compression algorithm has been upgraded from HPACK to QPACK.
-
On June 6, 2022, HTTP/3 was officially written into the RFC document, and HTTP/1.1 and HTTP/2 also updated their RFC documents.
-
-
PS: To guarantee reliable transmission, TCP has a "packet loss retransmission" mechanism: a lost packet must be retransmitted and acknowledged first, and until then the packets behind it, even if already received, can only sit in the buffer (kernel) where the upper-layer application (user) cannot access them. This can be seen in the following image: the red square is the packet that blocks the TCP connection.
(Actually, there is a bit of confusion here: does this mean that the blocking issues solved by HTTP/3 before were only those from kernel to user? Specifically, it refers to the blocking issues after TCP; what blocking issues exist at this stage? 🤔? Experts are welcome to clarify~)
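The difference between multiplexing over TCP and over QUIC can be simulated in a few lines. This is a deliberately simplified model with made-up names: each packet carries exactly one frame tagged with a stream ID, one packet is known to be lost, and retransmission is not modelled.

```python
from collections import defaultdict


def tcp_readable(packets, lost_seq):
    """HTTP/2 over TCP: bytes are released strictly in sequence order."""
    readable = []
    for seq, (stream_id, chunk) in enumerate(packets):
        if seq == lost_seq:
            break  # everything behind the gap stays in the kernel buffer
        readable.append((stream_id, chunk))
    return readable


def quic_readable(packets, lost_seq):
    """HTTP/3 over QUIC: only the stream that lost a packet has to wait."""
    blocked_stream = packets[lost_seq][0]
    readable = []
    for seq, (stream_id, chunk) in enumerate(packets):
        if stream_id == blocked_stream and seq >= lost_seq:
            continue  # this stream waits for the retransmission
        readable.append((stream_id, chunk))
    return readable


def reassemble(frames):
    """Group frames by stream ID, as the receiver of a multiplexed link does."""
    bodies = defaultdict(bytes)
    for stream_id, chunk in frames:
        bodies[stream_id] += chunk
    return dict(bodies)


# Three interleaved streams; the packet at index 1 (stream 3) is lost.
packets = [(1, b"<htm"), (3, b"body"), (1, b"l>"), (5, b"js()")]
over_tcp = reassemble(tcp_readable(packets, lost_seq=1))
over_quic = reassemble(quic_readable(packets, lost_seq=1))
```

Over TCP the application can read only the first chunk of stream 1, even though the data for streams 1 and 5 has fully arrived. Over QUIC, streams 1 and 5 are delivered completely and only stream 3 waits.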
The Development of HTTPS#
This section discusses the development from TLS 1.2 to TLS 1.3. Previous versions have been deprecated due to various security issues, which can be learned from the article Information Security | (Supplement) How to Establish Trust in the Internet Age?.
For TLS 1.3, its main optimization goals include three:
-
Compatibility with TLS 1.2. To ensure that older devices can upgrade the protocol more easily, TLS 1.3 maintains the original record format and uses extension protocols to add some "extension fields" at the end of the original records to increase new functionalities. Older versions of TLS can directly ignore them, achieving "backward compatibility."
-
More secure. TLS 1.3 has streamlined the supported cipher suites for security reasons, leaving only five cipher suites. The traditional handshake method based on RSA mentioned earlier has been abolished.
-
Higher performance. The process of establishing a connection for HTTPS includes both TCP handshake and TLS handshake. In TLS 1.2, the TLS handshake takes 2-RTT, while in TLS 1.3, this time has been optimized to 1-RTT. How is this achieved?
-
The answer lies in the previous point: because so few cipher suites remain, the client can put the key exchange parameters for everything it supports directly into the ClientHello message, allowing the server to select one and immediately generate the communication key for encrypted communication! The client no longer has to wait for the server to confirm the cipher suite before sending its parameters, as it did in TLS 1.2.
-
Besides the standard 1-RTT handshake, TLS 1.3 can achieve a 0-RTT handshake if the client has connected to the server before and cached the relevant parameters. However, data sent this way gives up forward secrecy and is exposed to replay attacks, so users need to weigh the pros and cons.
-
Below, I will also include a comparison diagram of the communication processes from "Understanding the HTTP Protocol," so you can see their differences.
-
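To get a feel for what the RTT numbers above mean in practice, here is a tiny back-of-the-envelope calculation. The 50 ms round-trip time and the names used are assumptions for illustration; only the handshake round trips mentioned in the text are counted.

```python
TCP_HANDSHAKE_RTT = 1
TLS_HANDSHAKE_RTT = {"TLS 1.2": 2, "TLS 1.3": 1, "TLS 1.3 0-RTT": 0}


def setup_time_ms(tls_mode: str, rtt_ms: float) -> float:
    """Time spent on handshakes before the first HTTP request can be sent."""
    return (TCP_HANDSHAKE_RTT + TLS_HANDSHAKE_RTT[tls_mode]) * rtt_ms


setup_times = {mode: setup_time_ms(mode, rtt_ms=50) for mode in TLS_HANDSHAKE_RTT}
```

With a 50 ms round trip, the handshakes cost 150 ms under TLS 1.2, 100 ms under TLS 1.3, and 50 ms with 0-RTT resumption. On mobile networks with much higher round-trip times, the saving grows proportionally.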
Epilogue (Good News)#
Alright, that's all for today. Let's return to our goals and see whether you can recall the specific knowledge behind each one:
-
🔍 Quickly locate HTTP issues.
-
🥣 Familiarize with common header fields in HTTP messages.
-
🔐 Understand basic encryption knowledge.
Of course, don't forget our ultimate goal🏁: If you are interested in HTTP, try to independently delve into HTTP using WireShark, Chrome, Telnet tools, and RFC documents~
This is Bo2SS, see you next time!
Breaking news: After more than a year of casual operation, on August 14, 2022, at 10:45, the fan base of Bo2SS quietly surpassed 500🎉! Everyone, come and help think about how to celebrate (or take advantage of) Bo2SS? Feel free to vote or leave a message below!
Vote:
A. Send red envelopes for good luck
B. Give away books to gain knowledge
C. Create a group chat to promote friendship
D. Stop playing and continue writing articles
E. All of the above, don’t miss out
F. None of the above, I’ll leave a message instead