Robots with multi-sensors always have a problem of weak pairing among different modals of the collected information produced by multi-sensors, which leads to a bad perception performance during robot interaction. To solve this problem, this paper proposes a Force Vision Sight (FVSight) sensor, which utilizes a distributed flexible tactile sensing array integrated with a vision unit. This innovative approach aims to enhance the overall perceptual capabilities for object recognition. The core idea is using one perceptual layer to trigger both tactile images and force-tactile arrays. It allows the two heterogeneous tactile modal information to be consistent in the temporal and spatial dimensions, thus solving the problem of weak pairing between visual and tactile data. Two experiments are specially designed, namely object classification and slip detection. A dataset containing 27 objects with deep presses and shallow presses is collected for classification, and then 20 slip experiments on three objects are conducted. The determination of slip and stationary state is accurately obtained by covariance operation on the tactile data. The experimental results show the reliability of generated multimodal data and the effectiveness of our proposed FVSight sensor.