History log of /src/sys/netinet/tcp_usrreq.c (Results 1 – 25 of 1686)
Revision Date Author Comments
# 7b71f57f 03-Dec-2025 Warner Losh <imp@FreeBSD.org>

netinet: Remove left-over sys/cdefs.h

These were for $FreeBSD$ that was removed a while ago, but these
includes didn't get swept up in that. Remove them all now.

Sponsored by: Netflix
MFC After: 2 weeks


# dd0e6bb9 22-Nov-2025 Andrew Gallatin <gallatin@FreeBSD.org>

tcp: Enable symmetric hashing by setting hash on outgoing conns

Now that we can trust NICs to supply an identical hash result
to software, we can setup the inpcb hash on outgoing connections.
This gives us symmetric hashing, meaning packets should enter
and leave on the same NIC queue.

Differential Revision: https://reviews.freebsd.org/D53104
Reviewed by: adrian, cc, kbowling, tuexen, zlei
Sponsored by: Netflix


# 8e8956f7 02-Nov-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: use %b when showing flags for a tcpcb

This is much more compact. Thanks to markj@ for suggesting the change.

Reviewed by: markj, Peter Lei, imp, Nick Banks
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53510


# 9aa5a79e 31-Oct-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: optionally print inp when printing tcpcb

Add /i option to the ddb commands show tcpcb and show all tcpcbs,
which enables the printing of the t_inpcb.

Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53497


# f2c2ed7d 25-Jul-2025 Gleb Smirnoff <glebius@FreeBSD.org>

sendfile: don't hack sb_lowat for sockets that manage the watermark

In the sendfile(2) we carry an old hack (originating from d99b0dd2c5297)
to help dumb benchmarks and applications to achieve higher performance. We
would modify low watermark on the socket send buffer to avoid socket being
reported as writable too early, which would result in lots of small
writes.

Skip that hack for applications that do setsockopt(SO_SNDLOWAT) or that
register the socket in kevent(2) with NOTE_LOWAT feature. First, we don't
want the hack to rewrite the watermark value explicitly specified by the
user. Second, in certain cases that can lead to real performance
regressions. A kevent(2) with NOTE_LOWAT would report socket as writable,
but then sendfile(2) would write 0 bytes and return EAGAIN.

The change also disables the hack for unix(4) sockets, leaving only TCP.

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D50581
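
For illustration only (not part of the commit): a minimal sketch of how an application sets the send low watermark itself, which is the case in which the commit says the sendfile(2) hack is now skipped. The helper name set_send_lowat() is hypothetical, and some systems reject SO_SNDLOWAT, so the helper reports the errno value instead of assuming success.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: ask the kernel to report the socket as writable
 * only once at least `lowat` bytes of send buffer space are available.
 * Returns 0 on success, or the errno value from setsockopt(2).
 */
int
set_send_lowat(int fd, int lowat)
{
	if (setsockopt(fd, SOL_SOCKET, SO_SNDLOWAT, &lowat,
	    sizeof(lowat)) == -1)
		return (errno);
	return (0);
}
```

A caller would invoke set_send_lowat(fd, 4096) on a TCP socket before using it with sendfile(2). The value 4096 is arbitrary.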


# 15c991fd 24-Jul-2025 Nick Banks <nickbanks@netflix.com>

tcp: remove trailing whitespaces

Reviewed by: cc, tuexen, Peter Lei
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51437


# 96f544bc 07-Jul-2025 Michael Tuexen <tuexen@FreeBSD.org>

tcp: don't allow to connect a TCP/IPv6 endpoint in TIME WAIT state

This ensures the TCP/IPv4 and TCP/IPv6 behave the same.

Reported by: syzbot+4de353ba85dac4dcb1ab@syzkaller.appspotmail.com
Reviewed by: Peter Lei
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51125


# ba3d5479 17-Jun-2025 Mark Johnston <markj@FreeBSD.org>

tcp: Fix the SO_REUSEPORT_LB check

This needs to happen in tcp_connect() rather than tcp_usr_connect(), as
the latter is reachable by implied connect() via sendto().

Reviewed by: glebius
Reported by: syzbot+eecc86e6952fd9ba9f11@syzkaller.appspotmail.com
Fixes: c7f803c71dae ("inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse")
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D50893


# 0dc78204 10-Jun-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: fix handling of BBLog entries when BBLog is disabled

Fixes: a62c6b0de48a ("ddb: add optional printing of BBLog entries")
MFC after: 1 week
Sponsored by: Netflix, Inc.


# a62c6b0d 10-Jun-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: add optional printing of BBLog entries

Add a /b option to show tcpcb and show all tcpcbs to print BBLog
entries. Right now this supports the entries generated by the
FreeBSD default TCP stack. It should help in debugging issues
reported by syzkaller.
The syntax for printing sent and received packets is similar to the
one used by packetdrill, since the output of ddb will be used to
create packetdrill scripts for debugging.

Reviewed by: thj
Tested by: thj
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D50629


# f1430567 28-May-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: add show all tcpcbs

Add a command to show all TCP control blocks. Also provide an option
to limit the output to TCP control blocks, which are locked.
The plan is to run show all tcpcbs/l when syzkaller triggers a panic.
If a TCP control block is affected, it is most likely locked and
therefore the command shows the information of the affected TCP
control block.

Reviewed by: markj, thj
Tested by: thj
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D50516


# 8d4f495d 28-May-2025 Michael Tuexen <tuexen@FreeBSD.org>

ddb: improve show tcpcb

Print the name of the TCP function block and the name of the
congestion control algorithm. Furthermore, print some information
related to Black Box Logging.

Reviewed by: thj
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D50535


# a0da2f73 08-May-2025 Dag-Erling Smørgrav <des@FreeBSD.org>

Remove remaining mentions of pr_usrreq.

When struct pr_usrreq was folded into struct protosw and the function
pointers it contained were renamed from pru_* to pr_* in 2022, a
number of references to the old names in comments and error messages
were missed. Chase them down and fix them.

Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Reviewed by: kevans, glebius
Differential Revision: https://reviews.freebsd.org/D50190


# a35f24c9 30-Apr-2025 Gleb Smirnoff <glebius@FreeBSD.org>

sendfile: factor out socket send buffer space sensing into a method

Move a block of code that works with the socket send buffer from the main
sendfile loop into a separate function. Make it a protocol method, so
that protocols may provide a different one.

While here, provide a long comment explaining why we modify sb_lowat and
why we can't just remove that hack.

No functional change intended.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D48918


# 6e764890 31-Mar-2025 Michael Tuexen <tuexen@FreeBSD.org>

tcp: remove support for TCPPCAP

This feature could be used to store the last sent and received TCP
packets for a TCP endpoint. There was no utility to get these packets
from a live system or core.
This functionality is now provided by TCP Black Box Logging, which also
stores additional events. There are tools to get these traces from a
live system or a core.
Therefore remove TCPPCAP to avoid maintaining it, when it is not
used anymore.

Reviewed by: rrs, rscheff, Peter Lei, glebiu
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D49589


# c7f803c7 07-Mar-2025 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse

This combination doesn't make any sense. This socket option makes sense
only on a socket that is going to be a listening one. There are two
options here: refuse connect(2) on a socket that has the option set
previously, or ignore (and clear) the option. After some discussion on
phabricator, we have chosen the former, for safety and consistency
reasons. Any programmer that runs this sequence is doing something wrong
and should be informed of that with appropriate error code.

Since connect(2) is a SUS API that has a defined set of error codes, none
of which corresponds to "a socket has non-standard incompatible socket
option set", we decided to return the same error that an already listening
socket would return.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D49150


# e92a78ad 07-Mar-2025 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: return EOPNOTSUPP on attempt to connect(2) a listening socket

This is the error code specified by SUS. Only the TCP over IPv6 required
this fix.

Fixes: bd4a39cc93d9faf8b5c000855d5aa90df592dd49
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D49275
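
For illustration only (not part of the commit): a minimal sketch that calls connect(2) on a listening TCP/IPv6 socket and reports the resulting error code. The helper name connect_on_listener_errno() is hypothetical. On a FreeBSD kernel containing this change the expected value is EOPNOTSUPP; other systems may report a different code, so the sketch returns whatever errno it observes.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: put an IPv6 TCP socket into the listening state,
 * then call connect(2) on that same socket.  Returns the errno set by
 * connect(2), 0 if connect(2) unexpectedly succeeds, or -1 if the setup
 * (socket, bind, listen, getsockname) fails.
 */
int
connect_on_listener_errno(void)
{
	struct sockaddr_in6 sin6;
	socklen_t len = sizeof(sin6);
	int fd, error = 0;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd == -1)
		return (-1);

	/* Bind to the loopback address; port 0 lets the kernel pick one. */
	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
	sin6.sin6_addr = in6addr_loopback;
	if (bind(fd, (struct sockaddr *)&sin6, sizeof(sin6)) == -1 ||
	    listen(fd, 1) == -1 ||
	    getsockname(fd, (struct sockaddr *)&sin6, &len) == -1) {
		close(fd);
		return (-1);
	}

	/* The misuse under discussion: connect(2) on a listening socket. */
	if (connect(fd, (struct sockaddr *)&sin6, sizeof(sin6)) == -1)
		error = errno;
	close(fd);
	return (error);
}
```

Comparing the returned value against EOPNOTSUPP is one way to check the behaviour described in the commit message.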


# 5dc99e9b 06-Feb-2025 Mark Johnston <markj@FreeBSD.org>

tcp: Add a sysctl to modify listening socket FIB inheritance

Introduce the net.inet.tcp.bind_all_fibs tunable, set to 1 by default
for compatibility with current behaviour. When set to 0, all TCP
listening sockets are private to their FIB. Inbound connection requests
will only succeed if a matching inpcb is bound to the same FIB as the
request.

No functional change intended, as the new behaviour is not enabled by
default.

Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D48663


# bbd0084b 06-Feb-2025 Mark Johnston <markj@FreeBSD.org>

inpcb: Add a flags parameter to in_pcbbind()

Add a flag, INPBIND_FIB, which means that the inpcb is local to its FIB
number. When this flag is specified, duplicate bindings are permitted,
so long as each FIB contains at most one inpcb bound to the same
address/port. If an inpcb is bound with this flag, it'll have the
INP_BOUNDFIB flag set.

No functional change intended.

Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D48661


# 06bf119f 28-Jan-2025 Gleb Smirnoff <glebius@FreeBSD.org>

sockets/tcp: quick fix for regression with SO_REUSEPORT_LB

There was a long living problem that pr_listen is called every time on
consecutive listen(2) syscalls. Up until today it produces spurious TCP
state change events in tracing software and other harmless problems. But
with 7cbb6b6e28db we started to call LIST_REMOVE() twice on the same
entry.

This is quite ugly, but quick and robust fix against regression, that we
decided to put in the scope of the January stabilization week. A better
refactoring will happen later.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D48703
Fixes: 7cbb6b6e28db33095a1cf7a8887921a5ec969824


# 7cbb6b6e 23-Jan-2025 Mark Johnston <markj@FreeBSD.org>

inpcb: Close some SO_REUSEPORT_LB races, part 2

Suppose a thread adds a socket to an existing TCP lbgroup that is
actively accepting connections. It has to do the following operations:
1. set SO_REUSEPORT_LB on the socket
2. bind() the socket to the shared address/port
3. call listen()

Step 2 makes the inpcb visible to incoming connection requests.
However, at this point the inpcb cannot accept new connections. If
in_pcblookup() matches it, the remote end will see ECONNREFUSED even
when other listening sockets are present in the lbgroup. This means
that dynamically adding inpcbs to an lbgroup (e.g., by starting up new
workers) can trigger spurious connection failures for no good reason.
(A similar problem exists when removing inpcbs from an lbgroup, but that
is harder to fix and is not addressed by this patch; see the review for
a bit more commentary.)

Fix this by augmenting each lbgroup with a linked list of inpcbs that
are pending a listen() call. When adding an inpcb to an lbgroup, keep
the inpcb on this list if listen() hasn't been called, so it is not yet
visible to the lookup path. Then, add a new in_pcblisten() routine which
makes the inpcb visible within the lbgroup now that it's safe to let it
handle new connections.

Add a regression test which verifies that we don't get spurious
connection errors while adding sockets to an LB group.

Reviewed by: glebius
MFC after: 1 month
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D48544


# 053a9884 23-Dec-2024 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: don't ever return ECONNRESET on close(2)

The SUS doesn't mention this error code as a possible one [1]. The FreeBSD
manual page specifies a possible ECONNRESET for close(2):

[ECONNRESET] The underlying object was a stream socket that was
shut down by the peer before all pending data was
delivered.

In the past it had been EINVAL (see 21367f630d72), and this EINVAL was
added as a safety measure in 623dce13c64ef. After conversion to
ECONNRESET it had been documented in the manual page in 78e3a7fdd51e6, but
I bet wasn't ever tested to actually be ever returned, cause the
tcp-testsuite[2] didn't exist back then. So documentation is incorrect
since 2006, if my bet wins. Anyway, in the modern FreeBSD the condition
described above doesn't end up with ECONNRESET error code from close(2).
The error condition is reported via SO_ERROR socket option, though. This
can be checked using the tcp-testsuite, temporarily disabling the
getsockopt(SO_ERROR) lines using sed command [3]. Most of these
getsockopt(2)s are followed by '+0.00 close(3) = 0', which will confirm
that close(2) doesn't return ECONNRESET even on a socket that has the
error stored, neither it is returned in the case described in the manual
page. The latter case is covered by multiple tests residing in tcp-
testsuite/state-event-engine/rcv-rst-*.

However, the deleted block of code could be entered in a race condition
between close(2) and processing of incoming packet, when connection had
already been half-closed with shutdown(SHUT_WR) and sits in TCPS_LAST_ACK.
This was reported in the bug 146845. With the block deleted, we will
continue into tcp_disconnect() which has proper handling of INP_DROPPED.

The race explanation follows. The connection is in TCPS_LAST_ACK. The
network input thread acquires the tcpcb lock first, sets INP_DROPPED,
acquires the socket lock in soisdisconnected() and clears SS_ISCONNECTED.
Meanwhile, the syscall thread goes through sodisconnect() which checks for
SS_ISCONNECTED locklessly(!). The check passes and the thread blocks on
the tcpcb lock in tcp_usr_disconnect(). Once input thread releases the
lock, the syscall thread observes INP_DROPPED and returns ECONNRESET.

- Thread 1: tcp_do_segment()->tcp_close()->in_pcbdrop(),soisdisconnected()
- Thread 2: sys_close()...->soclose()->sodisconnect()->tcp_usr_disconnect()

Note that the lockless operation in sodisconnect() isn't correct, but
enforcing the socket lock there will not fix the problem.

[1] https://pubs.opengroup.org/onlinepubs/9799919799/
[2] https://github.com/freebsd-net/tcp-testsuite
[3] sed -i "" -Ee '/\+0\.00 getsockopt\(3, SOL_SOCKET, SO_ERROR, \[ECONNRESET\]/d' $(grep -lr ECONNRESET tcp-testsuite)

PR: 146845
Reviewed by: tuexen, rrs, imp
Differential Revision: https://reviews.freebsd.org/D48148
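
For illustration only (not part of the commit): since the message states that the error condition is reported via the SO_ERROR socket option rather than by close(2), an application that cares about it can read SO_ERROR before closing. This is a minimal sketch; the helper names pending_socket_error() and close_reporting_error() are hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch the pending error stored on the socket.
 * Returns the stored error (0 if none), or -1 if getsockopt(2) fails.
 * Reading SO_ERROR also clears the stored value.
 */
int
pending_socket_error(int fd)
{
	int soerr = 0;
	socklen_t len = sizeof(soerr);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) == -1)
		return (-1);
	return (soerr);
}

/*
 * Hypothetical helper: record any pending socket error in *soerr, then
 * close the descriptor.  Returns the result of close(2).
 */
int
close_reporting_error(int fd, int *soerr)
{
	*soerr = pending_socket_error(fd);
	return (close(fd));
}
```

With this pattern a reset from the peer would show up in *soerr (for example as ECONNRESET) while close(2) itself returns 0.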


# c91dd7a0 19-Dec-2024 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: remove unused variable from tcp_usr_disconnect()


# 0b4539ee 14-Nov-2024 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: gc unused argument of in_pcbconnect()


# dded4e9e 13-Nov-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: change SOCKBUF_* macros to SOCK_[RECV|SEND]BUF_* macros

Change the older LOCK related macros over to the
dedicated send/recv buffer macros in the base tcp stack.

No functional change intended.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47567

