As the installed capacity of solar energy systems grows, solar forecasting offers a vital and cost-effective way to mitigate solar variability and support system operation. The temporal fusion transformer (TFT) has shown great potential for forecasting both solar irradiance and power output from multiple one-dimensional time series. Since spatiotemporal information is more beneficial for solar forecasting, this work applies a simple yet effective method to incorporate two-dimensional spatiotemporal inputs, derived from satellite data and numerical weather prediction (NWP), into TFT for more skillful irradiance forecasts. Results show that spatiotemporal inputs with simple spatial averaging generally yield better irradiance forecasts, with 4-h-ahead skill scores of up to 12.24%, compared with using single-location data. The benefit of spatiotemporal information is more pronounced under cloudy conditions, whereas it may introduce some misrepresentation when the sky is clear or less cloudy. NWP data generally improve intra-day solar forecasting performance with TFT, and the interpretability analysis shows that NWP irradiance products have a larger impact (up to 22.07%) on the overall results. Although NWP products are beneficial for intra-day solar forecasting when integrated with satellite-based data, their influence may vary with sky condition and forecast horizon. These impacts should be properly analyzed and interpreted in practical applications to ensure the reliability of energy systems. This work on improved irradiance forecasting with TFT and interpretability analysis is crucial for the operation of solar energy systems.
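
As a rough illustration of the two quantities mentioned above, the following minimal sketch shows how a two-dimensional patch of gridded data could be reduced to a one-dimensional series by simple spatial averaging, and how a forecast skill score could be computed. The function names, the array layout (time, height, width), and the RMSE-based skill definition are assumptions made for illustration; the abstract does not specify the exact averaging window or the skill score formula used in this work.

```python
import numpy as np


def spatial_average(patch_series):
    """Reduce a (time, height, width) array to a (time,) series.

    Hypothetical helper: averages every grid cell in the patch around the
    target site, ignoring missing (NaN) pixels.
    """
    patches = np.asarray(patch_series, dtype=float)
    if patches.ndim != 3:
        raise ValueError("expected an array shaped (time, height, width)")
    return np.nanmean(patches, axis=(1, 2))


def rmse(forecast, observed):
    """Root mean squared error between two equally shaped series."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((forecast - observed) ** 2)))


def skill_score(forecast, reference, observed):
    """Assumed RMSE-based skill: 1 - RMSE(forecast) / RMSE(reference).

    Positive values mean the forecast beats the reference forecast.
    """
    return 1.0 - rmse(forecast, observed) / rmse(reference, observed)


# Toy usage with synthetic numbers (not data from this work).
patches = np.arange(12, dtype=float).reshape(3, 2, 2)
averaged_input = spatial_average(patches)  # one value per time step
```

The averaged series can then be passed to a forecasting model alongside other one-dimensional inputs, which is what keeps this way of using spatial information simple.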