The detection and attribution of long-term patterns in hydrological time series have been important research topics for decades. A significant portion of the literature regards such patterns as ‘deterministic components’ or ‘trends’, even though the complexity of hydrological systems does not allow easy deterministic explanations and attributions. Consequently, trend estimation techniques have been developed to make and justify statements about tendencies in the historical data, which are often used to predict future events. Testing trend hypotheses on observed time series is widespread in the hydro-meteorological literature, mainly because of the interest in detecting the consequences of human activities on the hydrological cycle. This analysis usually relies on applying null hypothesis significance tests (NHSTs) for slowly varying and/or abrupt changes, such as the Mann-Kendall or Pettitt tests, to summary statistics of hydrological time series (e.g. annual averages, maxima, or minima). However, the reliability of this application has seldom been explored in detail. This paper discusses misuse, misinterpretation, and logical flaws of NHST for trends in the analysis of hydrological data from three different points of view: historic-logical, semantic-epistemological, and practical. Based on a review of the NHST rationale and of the basic statistical definitions of stationarity, nonstationarity, and ergodicity, we show that, even though the empirical estimation of trends in hydrological time series is always feasible from a numerical point of view, it is uninformative: according to deductive reasoning, it does not allow the inference of nonstationarity unless additional information on the underlying stochastic process is assumed a priori. This prevents the use of trend NHST outcomes to support nonstationary frequency analysis and modeling.
We also show that the correlation structures characterizing hydrological time series might easily be underestimated, further compromising the attempt to draw conclusions about trends spanning the period of record. Moreover, even though adjustment procedures that account for correlation have been developed, some of them are insufficient or are available only for certain tests, while others are theoretically flawed but still widely applied. In particular, using 250 unimpacted streamflow time series across the conterminous United States (CONUS), we show that the test results can change dramatically if the sequences of annual values are reproduced starting from daily streamflow records, whose larger sample sizes enable a more reliable assessment of the correlation structures.
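As an illustration of why the correlation structure matters for trend tests, the following minimal sketch applies the classical Mann-Kendall test, which assumes independent data, to synthetic trend-free series. It is not the analysis performed in this paper: the AR(1) model, the parameter values, and the function names `mann_kendall_p`, `simulate_ar1`, and `rejection_rate` are assumptions introduced here for illustration only.

```python
import math
import numpy as np


def mann_kendall_p(x):
    """Two-sided p-value of the classical Mann-Kendall test.

    Uses the normal approximation with continuity correction and the
    variance formula valid for independent data without ties.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # diff[i, j] = x[j] - x[i]; the strict upper triangle keeps pairs i < j.
    diff = x[None, :] - x[:, None]
    s = np.sign(diff[np.triu_indices(n, k=1)]).sum()
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value under the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_ar1(n, rho, rng):
    """Stationary, trend-free AR(1) series with lag-1 correlation rho (assumed model)."""
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = eps[0] / math.sqrt(1.0 - rho**2)  # draw the start from the stationary distribution
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    return x


def rejection_rate(n, rho, n_sim=1000, alpha=0.05, seed=0):
    """Fraction of trend-free series for which the test rejects 'no trend'."""
    rng = np.random.default_rng(seed)
    rejections = sum(
        mann_kendall_p(simulate_ar1(n, rho, rng)) < alpha for _ in range(n_sim)
    )
    return rejections / n_sim
```

Comparing `rejection_rate(50, 0.0)` with `rejection_rate(50, 0.5)` gives a rough sense of the issue: with independent data the rejection frequency should stay close to the nominal level `alpha`, whereas with positive serial correlation it is expected to be noticeably higher, even though neither set of series contains a trend.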