[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lwip-users] TCP bandwidth limited by rate of ACKs
From: |
Mason |
Subject: |
Re: [lwip-users] TCP bandwidth limited by rate of ACKs |
Date: |
Wed, 12 Oct 2011 18:01:40 +0200 |
User-agent: |
Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110928 Firefox/7.0.1 SeaMonkey/2.4.1 |
Kieran Mansley wrote:
> Mason wrote:
>
>> In the UDP test, any frame larger than 1500 bytes was
>> not passed to lwip. The bogus ACK was sent directly
>> from the RX context instead.
>
> OK, I'm not at all surprised that there is a big difference between the
> performance of that test, and the TCP performance you first measured.
>
> In the TCP case using the sockets API, you have:
> 1) 2 copies in each direction (app to/from lwIP, lwIP to/from your
> driver's buffer descriptors)
> 2) all the TCP protocol processing (which is probably not a big deal)
> 3) 2 context switches in each direction (driver to/from lwIP, lwIP
> to/from app)
>
> This is going to put a big dent in the throughput. You can avoid a lot
> of that by using the raw lwIP API. If you have to stick with the
> sockets API you may be stuck with relatively low throughput.
I didn't want to believe Simon when he wrote "From reading the wireshark
capture, it seems that the receiver simply is not fast enough."
So I profiled the system, and, indeed, the 450-MHz CPU is running
full tilt (cf. the profile analysis of the STB receiving 3500 MB
of data over TCP, at the end of this message.)
I need to make sense of the time spent inside _md_kernel_intr_mask,
but I fear I don't have much wiggle-room, aside from using the raw API.
Regards,
Mason
OS21 profile analysis for main.out
(16 bytes per bucket, sampled at 1000 Hz, system wide profile)
515.859 seconds (100.00%) : Total duration
Task breakdown:
218.810 seconds ( 42.42%) : tcpip_thread
144.978 seconds ( 28.10%) : rxapp
121.828 seconds ( 23.62%) : RxTask
0.476 seconds ( 0.09%) : Idle Task
0.350 seconds ( 0.07%) : STLAYER-GFX/CUR
0.120 seconds ( 0.02%) : STLAYER-GFX/CUR
0.092 seconds ( 0.02%) : STDVM_ServiceTa
0.071 seconds ( 0.01%) : STLAYER-GFX/CUR
0.041 seconds ( 0.01%) : phy: link
Interrupt level breakdown:
27.996 seconds ( 5.43%) : Interrupt level 8
0.968 seconds ( 0.19%) : Interrupt level 13
0.085 seconds ( 0.02%) : Interrupt level 15
Symbolic breakdown:
261.467 seconds ( 50.69%) : _md_kernel_intr_mask
38.134 seconds ( 7.39%) : memcpy
14.321 seconds ( 2.78%) : stxmac_dma_isr
13.661 seconds ( 2.65%) : tcp_input
9.797 seconds ( 1.90%) : _md_timer_system_time
7.401 seconds ( 1.43%) : _st40_kernel_interrupt_handler600
7.133 seconds ( 1.38%) : tcp_receive
6.450 seconds ( 1.25%) : _malloc_r
5.746 seconds ( 1.11%) : lwip_recvfrom
5.469 seconds ( 1.06%) : stxmac_dma_rx_start
4.537 seconds ( 0.88%) : _free_r
4.333 seconds ( 0.84%) : mutex_release
4.216 seconds ( 0.82%) : event_callback
4.189 seconds ( 0.81%) : ip_input
3.878 seconds ( 0.75%) : stxmac_dma_tx_start
3.572 seconds ( 0.69%) : _md_kernel_intr_mask_all
3.331 seconds ( 0.65%) : mutex_lock
3.315 seconds ( 0.64%) : __divdi3
3.016 seconds ( 0.58%) : device_ioctl
2.950 seconds ( 0.57%) : _os21_scheduler_lock
2.887 seconds ( 0.56%) : ip_output_if
2.830 seconds ( 0.55%) : _message_q_remove
2.806 seconds ( 0.54%) : __udiv_qrnnd_16
2.799 seconds ( 0.54%) : pbuf_alloc
2.746 seconds ( 0.53%) : sys_timeouts_mbox_fetch
2.701 seconds ( 0.52%) : semaphore_wait
2.700 seconds ( 0.52%) : interrupt_unmask
2.685 seconds ( 0.52%) : semaphore_wait_timeout
2.628 seconds ( 0.51%) : __libc_lock_init_complete_recursive
2.514 seconds ( 0.49%) : sys_arch_mbox_fetch
2.373 seconds ( 0.46%) : osplus_physical_address
2.308 seconds ( 0.45%) : pbuf_header
2.190 seconds ( 0.42%) : _message_q_add
2.137 seconds ( 0.41%) : _os21_task_is_valid
2.121 seconds ( 0.41%) : ethernet_input
2.101 seconds ( 0.41%) : recv_tcp
1.838 seconds ( 0.36%) : ethernet_api_ioctl
1.746 seconds ( 0.34%) : _os21_interrupt_handler
1.697 seconds ( 0.33%) : cache_purge_data
1.688 seconds ( 0.33%) : lwip_standard_chksum
1.654 seconds ( 0.32%) : _st40_cache_purge_data_work_1
1.582 seconds ( 0.31%) : __malloc_lock
1.551 seconds ( 0.30%) : profile_start_all
1.549 seconds ( 0.30%) : pbuf_copy_partial
1.485 seconds ( 0.29%) : __malloc_unlock
1.479 seconds ( 0.29%) : tcp_output
1.450 seconds ( 0.28%) : message_receive_timeout
1.444 seconds ( 0.28%) : ip4_addr_isbroadcast
1.414 seconds ( 0.27%) : ethernet_ops_do_async_tx
1.413 seconds ( 0.27%) : netconn_recv_data
1.335 seconds ( 0.26%) : etharp_output
1.321 seconds ( 0.26%) : do_recv
1.291 seconds ( 0.25%) : tcp_output_alloc_header
1.178 seconds ( 0.23%) : sys_arch_unprotect
1.175 seconds ( 0.23%) : _os21_scheduler_unlock
1.162 seconds ( 0.23%) : inet_chksum_pseudo
1.105 seconds ( 0.21%) : sys_arch_protect
1.090 seconds ( 0.21%) : semaphore_signal
1.086 seconds ( 0.21%) : sys_arch_sem_wait
1.073 seconds ( 0.21%) : sys_mbox_trypost
1.032 seconds ( 0.20%) : _os21_is_active
1.011 seconds ( 0.20%) : locate_handle
1.011 seconds ( 0.20%) : rx_task
0.982 seconds ( 0.19%) : tcp_recved
0.950 seconds ( 0.18%) : release_handle
0.947 seconds ( 0.18%) : __libc_lock_release_recursive
0.934 seconds ( 0.18%) : message_claim_timeout
0.922 seconds ( 0.18%) : tcp_parseopt
0.908 seconds ( 0.18%) : message_send
0.899 seconds ( 0.17%) : claim_handle
0.885 seconds ( 0.17%) : inet_chksum
0.881 seconds ( 0.17%) : ip_output
0.875 seconds ( 0.17%) : interrupt_mask
0.787 seconds ( 0.15%) : __libc_lock_acquire_recursive
0.786 seconds ( 0.15%) : pbuf_cat
0.760 seconds ( 0.15%) : time_now_set
0.728 seconds ( 0.14%) : tcpip_input
0.712 seconds ( 0.14%) : _os21_scheduler_exception_context
0.680 seconds ( 0.13%) : handle_frame
0.660 seconds ( 0.13%) : pbuf_free
0.621 seconds ( 0.12%) : osplus_handle
0.620 seconds ( 0.12%) : raw_input
0.607 seconds ( 0.12%) : icmp_input
0.584 seconds ( 0.11%) : _os21_vmem_util_min_page_size
0.584 seconds ( 0.11%) : message_release
0.568 seconds ( 0.11%) : ethernet_ops_do_async_rx
0.564 seconds ( 0.11%) : _md_timer_profiling_hz
0.563 seconds ( 0.11%) : netconn_recved
0.553 seconds ( 0.11%) : tcpip_thread
0.543 seconds ( 0.11%) : _os21_callback_intr_enter
0.530 seconds ( 0.10%) : malloc
0.510 seconds ( 0.10%) : pbuf_pool_is_empty
0.485 seconds ( 0.09%) : task_unlock
0.474 seconds ( 0.09%) : lwip_htons
0.473 seconds ( 0.09%) : netconn_recv_tcp_pbuf
0.435 seconds ( 0.08%) : lwip_read
0.434 seconds ( 0.08%) : sys_sem_signal
0.424 seconds ( 0.08%) : tcp_send_empty_ack
0.420 seconds ( 0.08%) : _os21_scheduler_pri
0.401 seconds ( 0.08%) : pbuf_take
0.396 seconds ( 0.08%) : low_level_output
0.387 seconds ( 0.08%) : lwip_close
0.375 seconds ( 0.07%) : task_lock
0.350 seconds ( 0.07%) : inet_chksum_pseudo_partial
0.340 seconds ( 0.07%) : wrapper
0.329 seconds ( 0.06%) : ip_route
0.319 seconds ( 0.06%) : tcp_update_rcv_ann_wnd
0.259 seconds ( 0.05%) : _scheduler_idle
0.253 seconds ( 0.05%) : release_buffer
0.252 seconds ( 0.05%) : mac_link_callback
0.234 seconds ( 0.05%) : lwip_ntohs
0.233 seconds ( 0.05%) : vmem_min_page_size
0.217 seconds ( 0.04%) : netconn_getaddr
0.217 seconds ( 0.04%) : _md_kernel_text_info
0.208 seconds ( 0.04%) : stxmac_dma_rx_abort
0.206 seconds ( 0.04%) : tcpip_apimsg
0.202 seconds ( 0.04%) : virtlay_NotifyDisplaySettings
0.196 seconds ( 0.04%) : lwip_ntohl
0.148 seconds ( 0.03%) : pbuf_realloc
0.139 seconds ( 0.03%) : _os21_kernel_state
0.135 seconds ( 0.03%) : stvtg_InterruptHandler
0.134 seconds ( 0.03%) : free
0.125 seconds ( 0.02%) : stevt_CallBothNotifyCB
0.124 seconds ( 0.02%) : _os21_callback_intr_exit
0.093 seconds ( 0.02%) : netconn_gethostbyname
0.089 seconds ( 0.02%) : etharp_send_ip
0.077 seconds ( 0.01%) : time_plus
0.073 seconds ( 0.01%) : sys_mbox_post
0.068 seconds ( 0.01%) : STEVT_NotifyWithSize
0.061 seconds ( 0.01%) : lwip_htonl
0.053 seconds ( 0.01%) : semaphore_value
0.051 seconds ( 0.01%) : VDP_IsUpdateNeeded
0.048 seconds ( 0.01%) : sys_arch_mbox_tryfetch
0.042 seconds ( 0.01%) : time_ticks_per_sec
0.042 seconds ( 0.01%) : device_check_args
0.042 seconds ( 0.01%) : LayerVideo_VSync
0.040 seconds ( 0.01%) : vdp_NotifyDisplaySettings
0.039 seconds ( 0.01%) : stlayer_VSyncCallback
0.033 seconds ( 0.01%) : STOS_SemaphoreSignal
0.028 seconds ( 0.01%) : ipaddr_ntoa_r
0.025 seconds ( 0.00%) : stevt_CheckID
0.024 seconds ( 0.00%) : task_context
0.023 seconds ( 0.00%) : EventChange
0.021 seconds ( 0.00%) : FastHardwareManagerTask
0.016 seconds ( 0.00%) : STDVMi_ServiceTask
0.015 seconds ( 0.00%) : lwip_recv
0.012 seconds ( 0.00%) : stxmac_phy_busy_wait
0.012 seconds ( 0.00%) : STOS_SemaphoreWaitTimeOut
0.010 seconds ( 0.00%) : VIRTLAY_IsUpdateNeeded
0.010 seconds ( 0.00%) : stlayer_HardwareManagerTerm
0.008 seconds ( 0.00%) : _malloc_trim_r
0.007 seconds ( 0.00%) : etharp_query
- [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/10
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Simon Goldschmidt, 2011/10/10
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/11
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Kieran Mansley, 2011/10/11
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/11
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Kieran Mansley, 2011/10/11
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs,
Mason <=
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Bill Auerbach, 2011/10/12
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/12
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, address@hidden, 2011/10/12
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/13
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, address@hidden, 2011/10/13
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/17
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, address@hidden, 2011/10/17
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Mason, 2011/10/18
- Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Simon Goldschmidt, 2011/10/18
Re: [lwip-users] TCP bandwidth limited by rate of ACKs, Bill Auerbach, 2011/10/10