Visual simultaneous localization and mapping (visual SLAM) has been well developed in recent decades. To facilitate tasks such as path planning and exploration, traditional visual SLAM systems usually provide mobile robots with the geometric map, which overlooks the semantic information. To address this problem, inspired by the recent success of the deep neural network, we combine it with the visual SLAM system to conduct semantic mapping. Both the geometric and semantic information will be projected into the 3D space for generating a 3D semantic map. We also use an optical-flow-based method to deal with the moving objects such that our method is capable of working robustly in dynamic environments. We have performed our experiments in the public TUM dataset and our recorded office dataset. Experimental results demonstrate the feasibility and impressive performance of the proposed method.